pandas DataFrame: extracting coordinate information from a column and forming a list of lat & lon pairs
I have a pandas DataFrame with a text/tuple column, as shown in the attached screenshot.
Here is a sample of the data in the column:
Column title - POLYGON_WKT_TEXT
POLYGON ( (-105.01884585094353 39.62333777125623,
-105.01851820478282 39.62333686626711,
-105.0185192106112 39.62315273546345,
-105.01888004910847 39.6231533822067,
-105.01888071966073 39.62322879067289,
-105.01884585094353 39.62322827417681,
-105.01884585094353 39.62333777125623) )
POLYGON ((-106.83036867299995 39.19331872400005,
-106.83027684299998 39.19329631000005,
-106.83034537399999 39.19313263400005,
-106.83060769199994 39.19318738000004,
-106.83056232299998 39.19329573700003,
-106.83052058199996 39.19328554900005,
-106.83048588899999 39.19336841100005,
-106.83036066599999 39.19333784600008,
-106.83036867299995 39.19331872400005))
...
...
I would like this field to be in the following format:
column name - POLYGON_WKT_TXT
[(-105.01884585094353 39.62333777125623), (-105.01851820478282 39.62333686626711), ...(-106.83036867299995 39.19331872400005)]
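So far I have tried splitting on the comma (",") into multiple columns, but the number of values varies from row to row, which makes that approach inefficient.
Thanks in advance for an elegant way to solve this task.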
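- If I understand your question correctly, this is simple
- Create a DataFrame from the WKT text provided in the question
- From that, create a list of tuples. It is as simple as using https://shapely.readthedocs.io/en/stable/manual.html#shapely.wkt.loads and exterior.coords
- The output was provided as an image because it does not format well as markdown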
import shapely.wkt
import pandas as pd
df = pd.DataFrame({"polygon_wkt_txt":["""POLYGON ( (-105.01884585094353 39.62333777125623,
-105.01851820478282 39.62333686626711,
-105.0185192106112 39.62315273546345,
-105.01888004910847 39.6231533822067,
-105.01888071966073 39.62322879067289,
-105.01884585094353 39.62322827417681,
-105.01884585094353 39.62333777125623) )""",
"""POLYGON ((-106.83036867299995 39.19331872400005,
-106.83027684299998 39.19329631000005,
-106.83034537399999 39.19313263400005,
-106.83060769199994 39.19318738000004,
-106.83056232299998 39.19329573700003,
-106.83052058199996 39.19328554900005,
-106.83048588899999 39.19336841100005,
-106.83036066599999 39.19333784600008,
-106.83036867299995 39.19331872400005))"""]})
# Parse each WKT string and store the exterior ring's (x, y) pairs as a list of tuples
df["tuple_list"] = df["polygon_wkt_txt"].apply(lambda txt: list(shapely.wkt.loads(txt).exterior.coords))
df
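If installing shapely is not an option, the same list of tuples can probably be built with a regular expression. This is only a minimal sketch, not part of the original answer: the helper name wkt_ring_to_tuples and its lat_first flag are made up for illustration, and it assumes each value is a single-ring POLYGON with plain decimal numbers (no holes, no scientific notation). In the sample data the first number of each pair is the longitude, so the flag swaps the order for anyone who needs (lat, lon).

```python
import re

# Matches one "x y" pair of signed decimal numbers, e.g. "-105.01 39.62".
_PAIR = re.compile(r"(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)")


def wkt_ring_to_tuples(wkt_text, lat_first=False):
    """Return the coordinate pairs of a single-ring POLYGON WKT string.

    Hypothetical helper: pairs come back as (x, y), i.e. (lon, lat) for the
    sample data, unless lat_first=True, in which case they are swapped.
    """
    pairs = [(float(x), float(y)) for x, y in _PAIR.findall(wkt_text)]
    if lat_first:
        pairs = [(y, x) for x, y in pairs]
    return pairs


# Short made-up polygon, laid out like the multi-line values in the question.
sample = """POLYGON ( (-105.5 39.25,
-105.25 39.25,
-105.25 39.5,
-105.5 39.25) )"""

lon_lat = wkt_ring_to_tuples(sample)
lat_lon = wkt_ring_to_tuples(sample, lat_first=True)
```

With the DataFrame from the answer, this would be applied as df["polygon_wkt_txt"].apply(wkt_ring_to_tuples). For polygons with holes or other geometry types, shapely.wkt.loads remains the safer choice, since the regex simply collects every pair it finds.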