Sum up the small slices as 'remaining' so the pie chart is more readable
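I have a problem. I want to plot a pie chart, but unfortunately only three ids are readable; the others take up only a tiny fraction of the chart.
Is there an option to sum up all the small ones and show the total under the name 'remaining'?
Is there also an option that selects them automatically? I could say the limit is 100, 1000, etc., but is there an option that does the summing automatically? In my real dataframe I use df.value_counts().
Dataframe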
id count
0 1 4521
1 2 1247
2 3 962
3 4 12
4 5 6
5 6 5
6 7 4
Code
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
d = {'id': [1, 2, 3, 4, 5, 6, 7],
'count': [4521, 1247, 962, 12, 6, 5, 4],
}
df = pd.DataFrame(data=d)
print(df)
colors = sns.color_palette('GnBu_r')
plt.pie(df['count'],
labels = df['id'], colors = colors)
plt.show()
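You can combine rows of the data based on a condition: compute each row's share of the total ('percentage'),
and if that share is below the threshold, sum those rows into one: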
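threshold = 0.1
df['percentage'] = df['count']/df['count'].sum()
# rows whose share of the total is below the threshold
small = df['percentage'] < threshold
# collapse them into a single 'remaining' row
remaining = pd.DataFrame({'id': ['remaining'],
                          'count': [df.loc[small, 'count'].sum()],
                          'percentage': [df.loc[small, 'percentage'].sum()]})
# pd.concat is used here because DataFrame.append was removed in pandas 2.0
df = pd.concat([df[~small], remaining], ignore_index = True)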
So you get:
id count percentage
0 1 4521 0.669084
1 2 1247 0.184549
2 3 962 0.142371
3 remaining 27 0.003996
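Since the question mentions value_counts(), here is a minimal sketch of the same threshold idea applied directly to the Series that value_counts() returns. The helper name `lump_small`, its parameters, and the sample data are my own assumptions for illustration, not part of pandas or of the original question.

```python
import pandas as pd

def lump_small(counts, threshold=0.1, label="remaining"):
    """Merge entries whose share of the total is below `threshold`.

    `lump_small` is a hypothetical helper, not a pandas function.
    """
    share = counts / counts.sum()
    small = share < threshold
    if not small.any():
        # nothing to merge, return the counts unchanged
        return counts.copy()
    lumped = pd.Series({label: counts[small].sum()})
    return pd.concat([counts[~small], lumped])

# Made-up raw observations standing in for a real column
ids = pd.Series([1] * 50 + [2] * 30 + [3] * 15 + [4] * 3 + [5] * 2)
counts = lump_small(ids.value_counts(), threshold=0.1)
print(counts)
```

The result is an ordinary Series, so it should be usable as plt.pie(counts, labels=counts.index).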
Complete code
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
threshold = 0.1
d = {'id': [1, 2, 3, 4, 5, 6, 7],
'count': [4521, 1247, 962, 12, 6, 5, 4]}
df = pd.DataFrame(data = d)
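df['percentage'] = df['count']/df['count'].sum()
# rows whose share of the total is below the threshold
small = df['percentage'] < threshold
# collapse them into a single 'remaining' row
remaining = pd.DataFrame({'id': ['remaining'],
                          'count': [df.loc[small, 'count'].sum()],
                          'percentage': [df.loc[small, 'percentage'].sum()]})
# pd.concat is used here because DataFrame.append was removed in pandas 2.0
df = pd.concat([df[~small], remaining], ignore_index = True)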
colors = sns.color_palette('GnBu_r')
plt.pie(df['count'],
labels = df['id'], colors = colors)
plt.show()
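On the "automatic" part of the question: as far as I know, plt.pie has no built-in option for this, so some rule still has to be chosen. One alternative to a share threshold is to keep the n largest rows and sum everything else. This is a different technique from the threshold above, sketched here under the assumption that the columns are named 'id' and 'count' as in the question; `keep_top_n` is a hypothetical helper name.

```python
import pandas as pd

def keep_top_n(df, n=3, label="remaining"):
    """Keep the n rows with the largest 'count' and sum the rest into one row.

    `keep_top_n` is a hypothetical helper, not a pandas function.
    """
    top = df.nlargest(n, "count")
    rest = df.drop(top.index)
    if rest.empty:
        # fewer than n + 1 rows, nothing to merge
        return top.reset_index(drop=True)
    other = pd.DataFrame({"id": [label], "count": [rest["count"].sum()]})
    return pd.concat([top, other], ignore_index=True)

data = pd.DataFrame({"id": [1, 2, 3, 4, 5, 6, 7],
                     "count": [4521, 1247, 962, 12, 6, 5, 4]})
result = keep_top_n(data, n=3)
print(result)
```

With the data from the question this gives the same four slices as the threshold version; which rule fits better depends on how the real counts are distributed.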