Sum up 'remaining' so that the pie chart is more readable

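I have a problem. I want to plot a pie chart, but unfortunately only three ids are readable; the others each make up only a tiny fraction. Is there an option to sum up all the small ones and show them under the name 'remaining'? Is there also a way to select them automatically? I could set a limit of 100, 1000, etc. by hand, but is there an option that does the summing automatically? In my real DataFrame I use df.value_counts().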

DataFrame

   id  count
0   1   4521
1   2   1247
2   3    962
3   4     12
4   5      6
5   6      5
6   7      4

Code

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt  # needed for plt.pie / plt.show

d = {'id': [1, 2, 3, 4, 5, 6, 7],
     'count': [4521, 1247, 962, 12, 6, 5, 4]}
df = pd.DataFrame(data=d)
print(df)

colors = sns.color_palette('GnBu_r')
plt.pie(df['count'],
        labels=df['id'], colors=colors)
plt.show()

You can combine rows of your data based on a condition: if a row's 'percentage' is below a threshold, sum those rows into one:

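threshold = 0.1
df['percentage'] = df['count'] / df['count'].sum()

# Rows whose share of the total is below the threshold
small = df['percentage'] < threshold

# Collapse those rows into a single 'remaining' row
remaining = pd.DataFrame({'id': ['remaining'],
                          'count': [df.loc[small, 'count'].sum()],
                          'percentage': [df.loc[small, 'percentage'].sum()]})

# DataFrame.append was removed in pandas 2.0, so use pd.concat
df = pd.concat([df[~small], remaining], ignore_index=True)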

So you get:

          id  count  percentage
0          1   4521    0.669084
1          2   1247    0.184549
2          3    962    0.142371
3  remaining     27    0.003996
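Since the question mentions df.value_counts(), the same idea can be sketched directly on the Series that value_counts() returns. This is only a minimal sketch and not part of the original answer: the helper name lump_small, its parameters, and the sample ids column are all made up for illustration.

```python
import pandas as pd

def lump_small(counts: pd.Series, threshold: float = 0.1,
               label: str = "remaining") -> pd.Series:
    """Collapse entries whose share of the total is below `threshold`
    into a single entry named `label` (hypothetical helper)."""
    share = counts / counts.sum()
    small = share < threshold
    if not small.any():
        # Nothing falls below the threshold, so return the counts unchanged
        return counts.copy()
    lumped = pd.Series({label: counts[small].sum()})
    return pd.concat([counts[~small], lumped])

# Made-up raw column standing in for the real data
ids = pd.Series([1] * 50 + [2] * 30 + [3] * 15 + [4] * 3 + [5] * 2)
grouped = lump_small(ids.value_counts(), threshold=0.1)
print(grouped)
```

The result can then be passed to the plot as before, for example plt.pie(grouped, labels=grouped.index).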

Complete code

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

threshold = 0.1

d = {'id': [1, 2, 3, 4, 5, 6, 7],
     'count': [4521, 1247, 962, 12, 6, 5, 4]}
df = pd.DataFrame(data=d)
df['percentage'] = df['count'] / df['count'].sum()

# Rows whose share of the total is below the threshold
small = df['percentage'] < threshold

# Collapse those rows into a single 'remaining' row
remaining = pd.DataFrame({'id': ['remaining'],
                          'count': [df.loc[small, 'count'].sum()],
                          'percentage': [df.loc[small, 'percentage'].sum()]})

# DataFrame.append was removed in pandas 2.0, so use pd.concat
df = pd.concat([df[~small], remaining], ignore_index=True)

colors = sns.color_palette('GnBu_r')
plt.pie(df['count'],
        labels=df['id'], colors=colors)
plt.show()

Plot (image not reproduced): a pie chart with four slices labeled 1, 2, 3 and remaining.
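The question also asks whether the cutoff can be chosen automatically. The answer above still needs a hand-picked threshold, so here is one possible heuristic, offered only as a sketch and not as the answerer's method: keep the largest rows until together they cover a target share of the total, and sum everything after that. The function name lump_by_coverage and the 0.95 default are assumptions chosen for illustration.

```python
import pandas as pd

def lump_by_coverage(df: pd.DataFrame, coverage: float = 0.95,
                     label: str = "remaining") -> pd.DataFrame:
    """Keep the largest rows until they cover `coverage` of the total count,
    then sum all later rows into one row named `label` (hypothetical helper)."""
    ordered = df.sort_values("count", ascending=False, ignore_index=True)
    ordered = ordered[["id", "count"]]
    share = ordered["count"] / ordered["count"].sum()
    # Share already covered before each row; keep a row while that is below the target
    covered_before = share.cumsum().shift(fill_value=0.0)
    keep = covered_before < coverage
    if keep.all():
        # Every row is needed to reach the target, so nothing is lumped
        return ordered
    rest = pd.DataFrame({"id": [label],
                         "count": [ordered.loc[~keep, "count"].sum()]})
    return pd.concat([ordered[keep], rest], ignore_index=True)

df = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6, 7],
                   "count": [4521, 1247, 962, 12, 6, 5, 4]})
result = lump_by_coverage(df, coverage=0.95)
print(result)
```

With the sample data, ids 1 to 3 already cover more than 95% of the total, so the four smallest rows are summed into 'remaining', the same grouping the fixed threshold of 0.1 produces. The coverage value is still a parameter, but it is expressed relative to the total rather than as an absolute limit such as 100 or 1000.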