Sorting by two values and keeping the two highest values
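I have a problem. I want to create a chart that shows the difference between buyers (b) and sellers (s), but I only want to show the top two countries for each. Is there a way to filter by b and s and keep only the 2 highest counts?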
Dataframe
   count country part
0     50      DE    b
1     20      CN    b
2     30      CN    s
3    100      BG    s
4      3      PL    b
5     40      BG    b
6      5      RU    s
Code
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

d = {'count': [50, 20, 30, 100, 3, 40, 5],
     'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
     'part': ['b', 'b', 's', 's', 'b', 'b', 's']
     }
df = pd.DataFrame(data=d)

# df_consignee_countries and df_orders_countries are not defined in this
# snippet, and their label column is 'party' while the plot uses 'part',
# so these lines are left commented out and the plot reads df directly.
# df_consignee_countries['party'] = 'consignee'
# df_orders_countries['party'] = 'buyer'
# df_party = pd.concat([df_consignee_countries, df_orders_countries], join="outer")

# Grouped bar chart: one bar per country and part
ax = sns.barplot(x="country", y="count", hue='part', data=df, palette='GnBu')
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

# Label each bar with its height
for p in ax.patches:
    ax.annotate(format(p.get_height(), '.1f'),
                (p.get_x() + p.get_width() / 2., p.get_height()),
                ha='center', va='center',
                xytext=(0, 9),
                textcoords='offset points')
What I want
   count country part
0     50      DE    b
2     30      CN    s
3    100      BG    s
5     40      BG    b
You first need to sort the values, then take the first 2 rows of each group:
>>> df.sort_values('count', ascending=False).groupby('part').head(2)
   count country part
3    100      BG    s
0     50      DE    b
5     40      BG    b
2     30      CN    s
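The result above comes back ordered by count, while the desired output in the question keeps the original row order. A minimal sketch of one way to get that, assuming the same df as in the question; the name top2 is only for illustration, and sort_index() is an extra step that is not part of the answer above:

```python
import pandas as pd

# Same data as in the question
d = {'count': [50, 20, 30, 100, 3, 40, 5],
     'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
     'part': ['b', 'b', 's', 's', 'b', 'b', 's']}
df = pd.DataFrame(data=d)

# Sort by count (largest first), keep the first 2 rows of each part,
# then restore the original row order by sorting on the index.
top2 = (df.sort_values('count', ascending=False)
          .groupby('part')
          .head(2)
          .sort_index())

print(top2)
```

The filtered frame can then be passed to sns.barplot through the data argument in place of the full frame.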
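Is this what you want?
# DataFrame.append was removed in pandas 2.0, so combine the two slices with pd.concat
df_b = df.loc[df['part']=='b', :].sort_values(by='count', ascending=False).head(n=2)
df_s = df.loc[df['part']=='s', :].sort_values(by='count', ascending=False).head(n=2)
df_b = pd.concat([df_b, df_s])
df_b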
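The per-part filtering above can be generalized so that each value of part does not need its own line. A rough sketch, assuming the same df as in the question; top_n_per_group is a hypothetical helper name, and it uses DataFrame.nlargest in place of sort_values(...).head(...):

```python
import pandas as pd

# Same data as in the question
d = {'count': [50, 20, 30, 100, 3, 40, 5],
     'country': ['DE', 'CN', 'CN', 'BG', 'PL', 'BG', 'RU'],
     'part': ['b', 'b', 's', 's', 'b', 'b', 's']}
df = pd.DataFrame(data=d)


def top_n_per_group(frame, group_col, value_col, n):
    """Hypothetical helper: keep the n rows with the largest value_col in each group."""
    # groupby iterates over the groups in sorted key order ('b', then 's');
    # nlargest returns each group's rows ordered from largest to smallest.
    parts = [group.nlargest(n, value_col) for _, group in frame.groupby(group_col)]
    return pd.concat(parts)


result = top_n_per_group(df, 'part', 'count', 2)
print(result)
```

This keeps all rows of one part together, which may be convenient if the frame is inspected before plotting; for the chart itself the row order should not matter.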