Expand a list from one dataframe to another dataframe pandas
I would like some help with the following.
I have the df below:
df
fruit State Count
apples CA 45
apples VT 54
apples MI 18
pears TX 20
pears AZ 89
plums NV 62
plums ID 10
I took the highest Count for each fruit across all states and got the following result:
df2
fruit State Count
apples VT 54
pears AZ 89
plums NV 62
Now I am trying to figure out how to bring the 'State' value from df2 into df as a new column, so that it looks like this:
df
fruit State Count Main
apples CA 45 VT
apples VT 54 VT
apples MI 18 VT
pears TX 20 AZ
pears AZ 89 AZ
plums NV 62 NV
plums ID 10 NV
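I can do something similar with the .transform() function, but I only know how to do that when calling the max function. Can I run a transform on df['list']? Or am I missing something else here?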
Use GroupBy.transform with DataFrameGroupBy.idxmax, but first call set_index to make the State column the index:
df['new'] = df.set_index('State').groupby('fruit')['Count'].transform('idxmax').values
print (df)
fruit State Count new
0 apples CA 45 VT
1 apples VT 54 VT
2 apples MI 18 VT
3 pears TX 20 AZ
4 pears AZ 89 AZ
5 plums NV 62 NV
6 plums ID 10 NV
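For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The DataFrame is rebuilt from the sample in the question, and the intermediate name by_state is only illustrative. It also shows why the positional assignment is needed: the transformed Series is indexed by State rather than by df's own row index.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "fruit": ["apples", "apples", "apples", "pears", "pears", "plums", "plums"],
    "State": ["CA", "VT", "MI", "TX", "AZ", "NV", "ID"],
    "Count": [45, 54, 18, 20, 89, 62, 10],
})

# With State as the index, idxmax returns the State label of the
# largest Count in each fruit group, broadcast to every row of the group.
by_state = df.set_index("State").groupby("fruit")["Count"].transform("idxmax")

# by_state is indexed by State, not by df's 0..6 row index, so assign
# the underlying values positionally instead of aligning on the index.
df["new"] = by_state.to_numpy()
print(df)
```

This relies on the rows keeping the same order after set_index, which holds here because set_index does not reorder anything.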
Another solution uses sort_values, drop_duplicates and set_index to build a Series for map:
s = (df.sort_values('Count', ascending=False)
       .drop_duplicates('fruit')
       .set_index('fruit')['State'])
print (s)
fruit
pears AZ
plums NV
apples VT
Name: State, dtype: object
df['new'] = df['fruit'].map(s)
print (df)
fruit State Count new
0 apples CA 45 VT
1 apples VT 54 VT
2 apples MI 18 VT
3 pears TX 20 AZ
4 pears AZ 89 AZ
5 plums NV 62 NV
6 plums ID 10 NV
Two steps :-) without groupby
df2=df.sort_values('Count').drop_duplicates('fruit',keep='last')
df['new']=df.fruit.map(df2.set_index('fruit').State)
df
Out[240]:
fruit State Count new
0 apples CA 45 VT
1 apples VT 54 VT
2 apples MI 18 VT
3 pears TX 20 AZ
4 pears AZ 89 AZ
5 plums NV 62 NV
6 plums ID 10 NV
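Since the question starts from an existing df2, the same lookup can also be written as a left merge instead of map. This is a different technique from the answers above, shown only as a sketch: the sample data is rebuilt from the question, df2 is recreated with idxmax as one assumed way to get it, and the names lookup and result are illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "fruit": ["apples", "apples", "apples", "pears", "pears", "plums", "plums"],
    "State": ["CA", "VT", "MI", "TX", "AZ", "NV", "ID"],
    "Count": [45, 54, 18, 20, 89, 62, 10],
})

# df2 as in the question: the row with the highest Count for each fruit.
df2 = df.loc[df.groupby("fruit")["Count"].idxmax()]

# Keep only the lookup columns and rename State so it does not
# clash with df's own State column after the merge.
lookup = df2[["fruit", "State"]].rename(columns={"State": "Main"})

# A left merge keeps every row of df, in df's original order.
result = df.merge(lookup, on="fruit", how="left")
print(result)
```

The merge returns a new DataFrame rather than adding a column in place, so map is probably the lighter choice when only one column is needed.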