Difficulty plotting a split violinplot using Seaborn and a pandas DataFrame
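My dataframe is full of likelihoods from a model that I'm using to identify points of interest on a set of images. Rows correspond to images and columns correspond to labels. The labels come in "Left" and "Right" versions. I'd like to use the split=True keyword to show the L and R sides on the same violin plot.
I have already created separate violin plots for the labels "LH1" and "RH1", as shown here:
But I'm trying to make one plot with 5 violins, each split into left and right, like this example from seaborn:
Seaborn requires a hue parameter, which I assume in my case is the categorical information "Left" or "Right". So I restructured/reshaped my dataframe, removing the "L" or "R" prefix from the labels and adding that information as a category in a "chirality" column. This is roughly what I have at the moment: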
df = pd.DataFrame.from_dict(
{'H1': {0: 0.55, 1: 0.56, 2: 0.46, 3: 0.93, 4: 0.74, 5: 0.35, 6: 0.75, 7: 0.86, 8: 0.81, 9: 0.88},
'H2': {0: 0.5, 1: 0.55, 2: 0.61, 3: 0.82, 4: 0.51, 5: 0.35, 6: 0.58, 7: 0.66, 8: 0.93, 9: 0.86},
'H3': {0: 0.42, 1: 0.51, 2: 0.86, 3: 0.59, 4: 0.46, 5: 0.71, 6: 0.58, 7: 0.72, 8: 0.53, 9: 0.92},
'H4': {0: 0.89, 1: 0.87, 2: 0.04, 3: 0.64, 4: 0.44, 5: 0.05, 6: 0.33, 7: 0.93, 8: 0.08, 9: 0.9},
'H5': {0: 0.92, 1: 0.75, 2: 0.13, 3: 0.85, 4: 0.51, 5: 0.15, 6: 0.38, 7: 0.92, 8: 0.36, 9: 0.76},
'chirality': {0: 'Left', 1: 'Left', 2: 'Left', 3: 'Left', 4: 'Left', 5: 'Right', 6: 'Right', 7: 'Right', 8: 'Right', 9: 'Right'},
'image': {0: 'image_0', 1: 'image_1', 2: 'image_2', 3: 'image_3', 4: 'image_4', 5: 'image_0', 6: 'image_1', 7: 'image_2', 8: 'image_3', 9: 'image_4'}})
H1 H2 H3 H4 H5 chirality image
0 0.55 0.50 0.42 0.89 0.92 Left image_0
1 0.56 0.55 0.51 0.87 0.75 Left image_1
2 0.46 0.61 0.86 0.04 0.13 Left image_2
3 0.93 0.82 0.59 0.64 0.85 Left image_3
4 0.74 0.51 0.46 0.44 0.51 Left image_4
5 0.35 0.35 0.71 0.05 0.15 Right image_0
6 0.75 0.58 0.58 0.33 0.38 Right image_1
7 0.86 0.66 0.72 0.93 0.92 Right image_2
8 0.81 0.93 0.53 0.08 0.36 Right image_3
9 0.88 0.86 0.92 0.90 0.76 Right image_4
# This is what I WANT to do, but seaborn requires an x and a y parameter.
fig, ax = plt.subplots(figsize=(15,6))
sns.set_theme(style="whitegrid")
ax = sns.violinplot(ax=ax,
data=df,
hue='chirality',
split=True)
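I've tried many different approaches, but I can't seem to get it to work. With the attempt above I get ValueError: Cannot use 'hue' without 'x' and 'y'
I don't even know what I could set them to, despite trying various things and reshaping my dataframe further. I think I want x to be the list of labels, y to be the likelihood values, and hue to specify L/R.
Thanks for any help!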
Seaborn works most easily with dataframes in "long form", which can be created e.g. via pandas' melt(). The resulting variable and value columns can then be used for x= and y=.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
df = pd.DataFrame.from_dict(
{'H1': {0: 0.55, 1: 0.56, 2: 0.46, 3: 0.93, 4: 0.74, 5: 0.35, 6: 0.75, 7: 0.86, 8: 0.81, 9: 0.88},
'H2': {0: 0.5, 1: 0.55, 2: 0.61, 3: 0.82, 4: 0.51, 5: 0.35, 6: 0.58, 7: 0.66, 8: 0.93, 9: 0.86},
'H3': {0: 0.42, 1: 0.51, 2: 0.86, 3: 0.59, 4: 0.46, 5: 0.71, 6: 0.58, 7: 0.72, 8: 0.53, 9: 0.92},
'H4': {0: 0.89, 1: 0.87, 2: 0.04, 3: 0.64, 4: 0.44, 5: 0.05, 6: 0.33, 7: 0.93, 8: 0.08, 9: 0.9},
'H5': {0: 0.92, 1: 0.75, 2: 0.13, 3: 0.85, 4: 0.51, 5: 0.15, 6: 0.38, 7: 0.92, 8: 0.36, 9: 0.76},
'chirality': {0: 'Left', 1: 'Left', 2: 'Left', 3: 'Left', 4: 'Left', 5: 'Right', 6: 'Right', 7: 'Right', 8: 'Right', 9: 'Right'},
'image': {0: 'image_0', 1: 'image_1', 2: 'image_2', 3: 'image_3', 4: 'image_4', 5: 'image_0', 6: 'image_1', 7: 'image_2', 8: 'image_3', 9: 'image_4'}})
df_long = df.melt(id_vars=['chirality', 'image'], value_vars=['H1', 'H2', 'H3', 'H4', 'H5'],
var_name='H', value_name='value')
fig, ax = plt.subplots(figsize=(15, 6))
sns.set_theme(style="whitegrid")
sns.violinplot(ax=ax,
data=df_long,
x='H',
y='value',
hue='chirality',
palette='summer',
split=True)
ax.set(xlabel='', ylabel='')
sns.despine()
plt.tight_layout()
plt.show()
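To see why the long form gives seaborn what it needs, it can help to look at what melt() returns. The following is only a minimal sketch on a cut-down frame (two label columns and four rows copied from the question's data; the names small and small_long are made up for illustration):

```python
import pandas as pd

# Cut-down version of the question's wide dataframe (illustrative subset only)
small = pd.DataFrame({
    'H1': [0.55, 0.56, 0.35, 0.75],
    'H2': [0.50, 0.55, 0.35, 0.58],
    'chirality': ['Left', 'Left', 'Right', 'Right'],
    'image': ['image_0', 'image_1', 'image_0', 'image_1'],
})

# Each (row, label column) pair becomes one row: the label name goes to 'H',
# the likelihood goes to 'value', and the id columns are repeated.
small_long = small.melt(id_vars=['chirality', 'image'],
                        value_vars=['H1', 'H2'],
                        var_name='H', value_name='value')

print(small_long.shape)
print(small_long)
```

Every row of the result carries one label name, one likelihood, and its chirality, so x='H', y='value' and hue='chirality' each map to exactly one column.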
Here is another example, using the iris dataset, converted to long form to show split violin plots for each combination of two species:
import matplotlib.pyplot as plt
import seaborn as sns
iris = sns.load_dataset('iris')
iris_long = iris.melt(id_vars='species')
iris_long['variable'] = iris_long['variable'].apply(lambda s: s.replace('_', '\n'))
sns.set_style('darkgrid')
fig, axs = plt.subplots(ncols=3, figsize=(12, 4), sharey=True)
palette = {'setosa': 'crimson', 'versicolor': 'cornflowerblue', 'virginica': 'limegreen'}
for excluded, ax in zip(iris.species.unique(), axs):
    sns.violinplot(ax=ax, data=iris_long[iris_long['species'] != excluded],
                   x='variable', y='value', hue='species', palette=palette, split=True)
    ax.set(xlabel='', ylabel='')
plt.tight_layout()
plt.show()
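The loop above drops one species per panel because a split violin pairs two hue levels; as far as I know, seaborn releases before 0.13 raise an error when split=True is used with any other number of hue levels, so it is worth checking against the installed version. A small sketch of that filtering step, using a tiny stand-in frame (demo_long and its values are illustrative, not the real dataset, so nothing needs to be downloaded):

```python
import pandas as pd

# Tiny stand-in for the melted iris frame (illustrative values only)
demo_long = pd.DataFrame({
    'species': ['setosa', 'versicolor', 'virginica'] * 2,
    'variable': ['sepal\nlength'] * 3 + ['sepal\nwidth'] * 3,
    'value': [5.1, 7.0, 6.3, 3.5, 3.2, 3.3],
})

# For each panel, exclude one species and record which hue levels remain
levels_per_panel = {}
for excluded in demo_long['species'].unique():
    subset = demo_long[demo_long['species'] != excluded]
    levels_per_panel[excluded] = sorted(subset['species'].unique())

print(levels_per_panel)
```

Each panel is left with exactly two species, which is what lets the two halves of every violin be compared directly.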