How to create a subplot for each group of a pandas column
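In the Titanic dataset, I need to create a chart that shows the percentage of passengers who survived in each class. It should consist of three pie charts: class 1 survived and not survived, class 2 survived and not survived, and the same for class 3.
How can I make this happen? I have already tried this type of code, but it produces the wrong values.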
import pandas as pd
import seaborn as sns # for dataset
df_titanic = sns.load_dataset('titanic')
   survived  pclass     sex   age  sibsp  parch     fare embarked  class    who  adult_male deck  embark_town alive  alone
0         0       3    male  22.0      1      0   7.2500        S  Third    man        True  NaN  Southampton    no  False
1         1       1  female  38.0      1      0  71.2833        C  First  woman       False    C    Cherbourg   yes  False
2         1       3  female  26.0      0      0   7.9250        S  Third  woman       False  NaN  Southampton   yes   True
c1s = len(df_titanic[(df_titanic.pclass==1) & (df_titanic.survived==1)].value_counts())
c2ns = len(df_titanic[(df_titanic.pclass==1) & (df_titanic.survived==0)].value_counts())
This kind of code produces the correct values, but I need them in 3 pie charts:
df_titanic.groupby(['pclass' ,'survived']).size().plot(kind='pie', autopct='%.2f')
class: 1, 2, 3; survived: 0, 1
Code:
import matplotlib.pyplot as plt
# the seaborn titanic dataset uses lowercase column names
labels = ["not survived", "survived"]
fig, axs = plt.subplots(1, 3)
axs[0].pie(df_titanic[df_titanic["pclass"] == 1].groupby(["survived"]).size(), labels=labels, autopct='%1.1f%%')
axs[1].pie(df_titanic[df_titanic["pclass"] == 2].groupby(["survived"]).size(), labels=labels, autopct='%1.1f%%')
axs[2].pie(df_titanic[df_titanic["pclass"] == 3].groupby(["survived"]).size(), labels=labels, autopct='%1.1f%%')
plt.show()
Result: [figure: three pie charts, one per pclass]
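The three near-identical pie calls can also be written as a loop. The following is only a sketch under stated assumptions: df_demo is a hypothetical eight-row stand-in for the titanic frame (same column names, invented rows), and the Agg backend is chosen so the code runs without a display. The reindex step is an addition, not part of the answer above; it keeps the counts in the order [0, 1] so that the fixed labels cannot be attached to the wrong wedge if a class happens to have no rows for one outcome.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the titanic data: same column names, invented rows
df_demo = pd.DataFrame({
    "pclass":   [1, 1, 1, 2, 2, 3, 3, 3],
    "survived": [1, 1, 0, 1, 0, 0, 0, 1],
})

labels = ["not survived", "survived"]
classes = [int(c) for c in sorted(df_demo["pclass"].unique())]

# squeeze=False always returns a 2D array of axes, even for a single class
fig, axs = plt.subplots(1, len(classes), figsize=(12, 4), squeeze=False)

for ax, pclass in zip(axs[0], classes):
    # counts of survived == 0 and survived == 1 for this class, in that order
    counts = (
        df_demo.loc[df_demo["pclass"] == pclass, "survived"]
        .value_counts()
        .reindex([0, 1], fill_value=0)
    )
    ax.pie(counts, labels=labels, autopct="%1.1f%%")
    ax.set_title(f"pclass {pclass}")
```

With the real data, replace df_demo with df_titanic and call plt.show() (or fig.savefig) at the end.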
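- The correct way to get subplots with pandas is to reshape the dataframe. pandas.crosstab is used here to shape the dataframe; pandas.DataFrame.pivot and pandas.DataFrame.pivot_table are other options for reshaping data for plotting.
- Then plot with pandas.DataFrame.plot, using kind='pie' and subplots=True.
- Extra code has been added for formatting:
  - Rotate the pclass label
  - Plot title
  - A custom legend, instead of a legend for each subplot
    - Specify labels for the legend
    - Specify colors for the number of labels
- Tested in python 3.8.12, pandas 1.3.4, matplotlib 3.4.3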
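To make the reshaping options in the first bullet concrete, here is a minimal sketch that builds the same survived-by-pclass count table three ways. It assumes a hypothetical eight-row frame, df_demo, in place of the titanic data; the groupby/unstack variant is an extra option that the list above does not name.

```python
import pandas as pd

# Hypothetical stand-in for the titanic data: same column names, invented rows
df_demo = pd.DataFrame({
    "pclass":   [1, 1, 1, 2, 2, 3, 3, 3],
    "survived": [1, 1, 0, 1, 0, 0, 0, 1],
})

# Option 1: crosstab counts the co-occurrences of the two columns
ct_demo = pd.crosstab(df_demo["survived"], df_demo["pclass"])

# Option 2: pivot_table with aggfunc="size" counts the rows in each cell
pt_demo = df_demo.pivot_table(index="survived", columns="pclass",
                              aggfunc="size", fill_value=0)

# Option 3 (not named in the list above): groupby, then unstack pclass
gb_demo = df_demo.groupby(["survived", "pclass"]).size().unstack(fill_value=0)

print(ct_demo)
```

Each result has survived as the index and one column per pclass, which is the shape that plot(kind='pie', subplots=True) needs: one pie per column.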
import seaborn as sns # for titanic data only
import pandas as pd
from matplotlib.patches import Patch # to create the colored squares for the legend
# load the dataframe
df = sns.load_dataset('titanic')
# reshaping the dataframe is the most important step
ct = pd.crosstab(df.survived, df.pclass)
# display(ct)
pclass      1   2    3
survived
0          80  97  372
1         136  87  119
# plot and add labels
colors = ['tab:blue', 'tab:orange'] # specify the colors so they can be used in the legend
labels = ["not survived", "survived"] # used for the legend
axes = ct.plot(kind='pie', autopct='%.1f%%', subplots=True, figsize=(12, 5),
               legend=False, labels=['', ''], colors=colors)
# flatten the array of axes
axes = axes.flat
# extract the figure object
fig = axes[0].get_figure()
# rotate the pclass label
for ax in axes:
    yl = ax.get_ylabel()
    ax.set_ylabel(yl, rotation=0, fontsize=12)
# create the legend
legend_elements = [Patch(fc=c, label=l) for c, l in zip(colors, labels)]
fig.legend(handles=legend_elements, loc=9, fontsize=12, ncol=2, borderaxespad=0, bbox_to_anchor=(0., 0.8, 1, .102), frameon=False)
fig.tight_layout()
fig.suptitle('pclass survival', fontsize=15)
Formatted plot: produced by the code above.
Unformatted plot:
axes = ct.plot(kind='pie', autopct='%.1f%%', subplots=True, figsize=(12, 5), labels=["not survived", "survived"])
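The percentages that autopct prints on each pie can be checked numerically. The sketch below rebuilds the crosstab by hand from the counts displayed earlier (ct_demo is a hypothetical name, used so that no dataset download is needed) and divides each column by its total.

```python
import pandas as pd

# Counts copied from the crosstab output shown in the answer
ct_demo = pd.DataFrame(
    {1: [80, 136], 2: [97, 87], 3: [372, 119]},
    index=pd.Index([0, 1], name="survived"),
)
ct_demo.columns.name = "pclass"

# Divide every column by its own total: the share of each outcome per class
pct = (ct_demo / ct_demo.sum() * 100).round(1)
print(pct)
```

For these counts the survived row comes out as 63.0, 47.3 and 24.2 percent for classes 1, 2 and 3. Starting from the raw dataframe, pd.crosstab(df.survived, df.pclass, normalize='columns') returns the same shares as fractions.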