Pandas: fill values in some columns for each row based on other columns

Question

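I have a DataFrame with missing values in one or more columns, and I am trying to fill them in based on the values in two other columns of the same record (i.e., fill the city_name and city_id columns based on the coordinates in the lat and long columns).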

city_name	city_id	lat	long
NaN	NaN	-121.77	37.24
NaN	NaN	-122.77	38.24
NaN	NaN	-123.77	39.24
new york	c1	-121.77	37.24
paris	c2	-122.77	38.24
london	c3	-123.77	39.24

How can I do this?

Answer 1

Try groupby with fillna:

df = df.fillna(df.groupby(["lat", "long"]).transform("first"))

>>> df

  city_id city_name     lat   long
0      c1  new york -121.77  37.24
1      c2     paris -122.77  38.24
2      c3    london -123.77  39.24
3      c1  new york -121.77  37.24
4      c2     paris -122.77  38.24
5      c3    london -123.77  39.24
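
For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the intermediate names lookup and filled are only illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question's table (assumed to be representative).
df = pd.DataFrame({
    "city_name": [np.nan, np.nan, np.nan, "new york", "paris", "london"],
    "city_id": [np.nan, np.nan, np.nan, "c1", "c2", "c3"],
    "lat": [-121.77, -122.77, -123.77, -121.77, -122.77, -123.77],
    "long": [37.24, 38.24, 39.24, 37.24, 38.24, 39.24],
})

# For every (lat, long) group, take the first non-null value of each
# remaining column and broadcast it back to all rows of that group.
lookup = df.groupby(["lat", "long"]).transform("first")

# fillna aligns on index and column labels, so only NaN cells are replaced.
filled = df.fillna(lookup)

print(filled)
```

Note that a row stays NaN if no other row with the same (lat, long) pair carries a value, and that grouping on float coordinates relies on exact equality of the values.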
