Pandas: fill missing values in some columns of each row based on other columns
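I have a DataFrame with missing values in one or more columns, and I am trying to fill them in based on the values in two other columns of the same record (i.e. fill the city_name and city_id columns based on the coordinates in the lat and long columns).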
city_name | city_id | lat | long |
---|---|---|---|
NaN | NaN | -121.77 | 37.24 |
NaN | NaN | -122.77 | 38.24 |
NaN | NaN | -123.77 | 39.24 |
new york | c1 | -121.77 | 37.24 |
paris | c2 | -122.77 | 38.24 |
london | c3 | -123.77 | 39.24 |
How can I do this?
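For anyone who wants to reproduce this, the sample data above could be built roughly as follows. This is only a minimal sketch: the variable name `df` and the use of `numpy.nan` for the missing entries are assumptions, since the question does not show how the frame was created.

```python
import numpy as np
import pandas as pd

# Sample frame from the question: the first three rows lack city info,
# the last three rows carry it for the same coordinates.
df = pd.DataFrame(
    {
        "city_name": [np.nan, np.nan, np.nan, "new york", "paris", "london"],
        "city_id": [np.nan, np.nan, np.nan, "c1", "c2", "c3"],
        "lat": [-121.77, -122.77, -123.77, -121.77, -122.77, -123.77],
        "long": [37.24, 38.24, 39.24, 37.24, 38.24, 39.24],
    }
)

# Count missing values per column to confirm which columns need filling.
missing_per_column = df.isna().sum()
print(missing_per_column)
```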
Try groupby and fillna:
>>> df = df.fillna(df.groupby(["lat", "long"]).transform("first"))
>>> df
  city_id city_name     lat   long
0      c1  new york -121.77  37.24
1      c2     paris -122.77  38.24
2      c3    london -123.77  39.24
3      c1  new york -121.77  37.24
4      c2     paris -122.77  38.24
5      c3    london -123.77  39.24
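To see why this works, here is a self-contained sketch that splits the one-liner into two steps. `transform("first")` returns, for every row, the first non-null value found in that row's (lat, long) group, in a frame that shares the index of `df`; `fillna` then aligns on index and column labels and only touches the missing cells. The names `lookup` and `filled` are illustrative, and the extra row at coordinates (0.0, 0.0) is a made-up addition, not part of the question, included to show that a coordinate pair with no known city simply stays missing.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "city_name": [np.nan, np.nan, np.nan, "new york", "paris", "london", np.nan],
        "city_id": [np.nan, np.nan, np.nan, "c1", "c2", "c3", np.nan],
        # Last row is hypothetical: coordinates that never appear with a city.
        "lat": [-121.77, -122.77, -123.77, -121.77, -122.77, -123.77, 0.0],
        "long": [37.24, 38.24, 39.24, 37.24, 38.24, 39.24, 0.0],
    }
)

# Step 1: for each (lat, long) group, take the first non-null value of every
# other column and broadcast it back to all rows of that group.
lookup = df.groupby(["lat", "long"]).transform("first")

# Step 2: fill only the missing cells of df with the matching cells of lookup.
filled = df.fillna(lookup)
print(filled)
```

Note that grouping on float coordinates relies on exact equality, so this approach assumes the lat and long values of matching rows are identical rather than merely close.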