Merging GeoDataFrames - TypeError: float() argument must be a string or a number, not 'Point'

I have a dataframe in which one column holds a series of shapely points, and a second dataframe with a column of polygons.

df.head()

                    

     hash number                               street unit  \
2024459  283e04eca5c4932a     SN  AVENIDA DOUTOR SEVERIANO DE ALMEIDA  NaN   
2024460  1a92a1c3cba7941a    485  AVENIDA DOUTOR SEVERIANO DE ALMEIDA  NaN   
2024461  837341c45de519a3    475  AVENIDA DOUTOR SEVERIANO DE ALMEIDA  NaN   

            city  district region   postcode  id                     geometry  
2024459  Jaguari       NaN     RS  97760-000 NaN  POINT (-54.69445 -29.49421)  
2024460  Jaguari       NaN     RS  97760-000 NaN  POINT (-54.69445 -29.49421)  
2024461  Jaguari       NaN     RS  97760-000 NaN  POINT (-54.69445 -29.49421)

poly_df.head()
                                          centroids                                           geometry
0   POINT (-29.31067315122428 -54.64176359828149)  POLYGON ((-54.64069 -29.31161, -54.64069 -29.3...
1   POINT (-29.31067315122428 -54.63961783106958)  POLYGON ((-54.63854 -29.31161, -54.63854 -29.3...
2  POINT (-29.31067315122428 -54.637472063857665)  POLYGON ((-54.63640 -29.31161, -54.63640 -29.3...

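I am checking whether each point falls within a polygon and, if so, inserting a point object into a cell of the other dataframe. However, I get the following error: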

Traceback (most recent call last):
   
  File "/tmp/ipykernel_4771/1967309101.py", line 1, in <module>
    df.loc[idx, 'centroids'] = poly_mun.loc[ix, 'centroids']

  File ".local/lib/python3.8/site-packages/pandas/core/indexing.py", line 692, in __setitem__
    iloc._setitem_with_indexer(indexer, value, self.name)

  File ".local/lib/python3.8/site-packages/pandas/core/indexing.py", line 1599, in _setitem_with_indexer
    self.obj[key] = infer_fill_value(value)

  File ".local/lib/python3.8/site-packages/pandas/core/dtypes/missing.py", line 516, in infer_fill_value
    val = np.array(val, copy=False)

TypeError: float() argument must be a string or a number, not 'Point'

I am using the following line:

df.loc[idx, 'centroids'] = poly_df.loc[ix, 'centroids']

I also tried at, with the same result.

Thanks

You can't create a new column of shapely geometries in pandas using loc:

In [1]: import pandas as pd, shapely.geometry

In [2]: df = pd.DataFrame({'mycol': [1, 2, 3]})

In [3]: df.loc[0, "centroid"] = shapely.geometry.Point([0, 0])
/Users/mikedelgado/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/indexing.py:1642: ShapelyDeprecationWarning: The array interface is deprecated and will no longer work in Shapely 2.0. Convert the '.coords' to a numpy array instead.
  self.obj[key] = infer_fill_value(value)
/Users/mikedelgado/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/dtypes/missing.py:550: FutureWarning: The input object of type 'Point' is an array-like implementing one of the corresponding protocols (`__array__`, `__array_interface__` or `__array_struct__`); but not a sequence (or 0-D). In the future, this object will be coerced as if it was first converted using `np.array(obj)`. To retain the old behaviour, you have to either modify the type 'Point', or assign to an empty array created with `np.empty(correct_shape, dtype=object)`.
  val = np.array(val, copy=False)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [3], in <cell line: 1>()
----> 1 df.loc[0, "centroid"] = shapely.geometry.Point([0, 0])

File ~/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/indexing.py:716, in _LocationIndexer.__setitem__(self, key, value)
    713 self._has_valid_setitem_indexer(key)
    715 iloc = self if self.name == "iloc" else self.obj.iloc
--> 716 iloc._setitem_with_indexer(indexer, value, self.name)

File ~/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/indexing.py:1642, in _iLocIndexer._setitem_with_indexer(self, indexer, value, name)
   1639     self.obj[key] = empty_value
   1641 else:
-> 1642     self.obj[key] = infer_fill_value(value)
   1644 new_indexer = convert_from_missing_indexer_tuple(
   1645     indexer, self.obj.axes
   1646 )
   1647 self._setitem_with_indexer(new_indexer, value, name)

File ~/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/dtypes/missing.py:550, in infer_fill_value(val)
    548 if not is_list_like(val):
    549     val = [val]
--> 550 val = np.array(val, copy=False)
    551 if needs_i8_conversion(val.dtype):
    552     return np.array("NaT", dtype=val.dtype)

TypeError: float() argument must be a string or a real number, not 'Point'

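Essentially, pandas doesn't know how to interpret the Point object: because the column does not exist yet, it creates a float column filled with NaN, and then it cannot handle the Point. This might get fixed in the future, but you are better off explicitly defining the column with object dtype first: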

In [27]: df['centroid'] = None

In [28]: df['centroid'] = df['centroid'].astype(object)

In [29]: df
Out[29]:
   mycol centroid
0      1     None
1      2     None
2      3     None

In [30]: df.loc[0, "centroid"] = shapely.geometry.Point([0, 0])
/Users/mikedelgado/opt/miniconda3/envs/rhodium-env/lib/python3.10/site-packages/pandas/core/internals/managers.py:304: ShapelyDeprecationWarning: The array interface is deprecated and will no longer work in Shapely 2.0. Convert the '.coords' to a numpy array instead.
  applied = getattr(b, f)(**kwargs)

In [31]: df
Out[31]:
   mycol     centroid
0      1  POINT (0 0)
1      2         None
2      3         None
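
The same workaround can be condensed into a short script. This is a minimal sketch, assuming pandas and shapely are installed; the column name and sample values are only illustrative, and at is used for the single-cell write (loc also works once the object column exists, as shown above).

```python
import pandas as pd
from shapely.geometry import Point

df = pd.DataFrame({"mycol": [1, 2, 3]})

# Create the column up front with object dtype so pandas does not
# try to infer a numeric fill value from the geometry.
df["centroid"] = pd.Series([None] * len(df), index=df.index, dtype=object)

# Scalar assignment into an existing object column stores the Point as-is.
df.at[0, "centroid"] = Point(0, 0)

print(df)
```

Rows that were never assigned keep their missing value, so they are easy to filter out afterwards with df["centroid"].notna().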

That said, joining two GeoDataFrames of polygons and points based on whether the points lie inside the polygons sounds like a job for geopandas.sjoin:

import geopandas as gpd
union = gpd.sjoin(polygon_df, points_df, predicate='contains')  # on geopandas < 0.10 use op='contains'
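
To make the spatial join concrete, here is a small self-contained sketch going in the other direction (points on the left, so every address row is kept). It assumes geopandas 0.10 or later, where the keyword is predicate; the cell_id column and the toy coordinates are hypothetical and not taken from the question's data.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical grid cells; cell_id stands in for any polygon attribute.
polygon_df = gpd.GeoDataFrame(
    {"cell_id": ["a", "b"]},
    geometry=[box(0, 0, 1, 1), box(2, 0, 3, 1)],
)

# Three toy points: one inside each cell and one outside both.
points_df = gpd.GeoDataFrame(
    {"number": ["SN", "485", "475"]},
    geometry=[Point(0.5, 0.5), Point(2.5, 0.5), Point(5, 5)],
)

# Left join keeps every point; points outside all polygons get NaN.
joined = gpd.sjoin(points_df, polygon_df, how="left", predicate="within")

print(joined[["number", "cell_id"]])
```

Because the join copies the polygon attributes onto each matching point, there is no need to loop over rows and assign geometries cell by cell.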