Difficulty plotting a split Violinplot using Seaborn and a Pandas Dataframe

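My dataframe is filled with likelihoods from a model that I am using to identify points of interest on a set of images. Rows correspond to images and columns correspond to labels. Each label comes in a "Left" and a "Right" version. I would like to show the L and R sides on the same violin plot using the split=True keyword.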

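I have already created separate violin plots for the labels "LH1" and "RH1", as shown here:

But I am trying to make one plot with 5 violins, each split into left and right. Like this example from seaborn: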

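Seaborn requires a hue parameter, which I suppose in my case would be the categorical information "Left" or "Right". So I restructured/reshaped my dataframe, removing the "L" or "R" prefix from the labels and adding that information as a category under a "chirality" column. This is roughly what I have at the moment: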

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame.from_dict(
        {'H1': {0: 0.55, 1: 0.56, 2: 0.46, 3: 0.93, 4: 0.74, 5: 0.35, 6: 0.75, 7: 0.86, 8: 0.81, 9: 0.88},
         'H2': {0: 0.5, 1: 0.55, 2: 0.61, 3: 0.82, 4: 0.51, 5: 0.35, 6: 0.58, 7: 0.66, 8: 0.93, 9: 0.86},
         'H3': {0: 0.42, 1: 0.51, 2: 0.86, 3: 0.59, 4: 0.46, 5: 0.71, 6: 0.58, 7: 0.72, 8: 0.53, 9: 0.92},
         'H4': {0: 0.89, 1: 0.87, 2: 0.04, 3: 0.64, 4: 0.44, 5: 0.05, 6: 0.33, 7: 0.93, 8: 0.08, 9: 0.9},
         'H5': {0: 0.92, 1: 0.75, 2: 0.13, 3: 0.85, 4: 0.51, 5: 0.15, 6: 0.38, 7: 0.92, 8: 0.36, 9: 0.76},
         'chirality': {0: 'Left', 1: 'Left', 2: 'Left', 3: 'Left', 4: 'Left', 5: 'Right', 6: 'Right', 7: 'Right', 8: 'Right', 9: 'Right'},
         'image': {0: 'image_0', 1: 'image_1', 2: 'image_2', 3: 'image_3', 4: 'image_4', 5: 'image_0', 6: 'image_1', 7: 'image_2', 8: 'image_3', 9: 'image_4'}})


     H1    H2    H3    H4    H5 chirality    image
0  0.55  0.50  0.42  0.89  0.92      Left  image_0
1  0.56  0.55  0.51  0.87  0.75      Left  image_1
2  0.46  0.61  0.86  0.04  0.13      Left  image_2
3  0.93  0.82  0.59  0.64  0.85      Left  image_3
4  0.74  0.51  0.46  0.44  0.51      Left  image_4
5  0.35  0.35  0.71  0.05  0.15     Right  image_0
6  0.75  0.58  0.58  0.33  0.38     Right  image_1
7  0.86  0.66  0.72  0.93  0.92     Right  image_2
8  0.81  0.93  0.53  0.08  0.36     Right  image_3
9  0.88  0.86  0.92  0.90  0.76     Right  image_4



# This is what I WANT to do, but seaborn requires x and y parameters.
fig, ax = plt.subplots(figsize=(15,6))
sns.set_theme(style="whitegrid")
ax = sns.violinplot(ax=ax, 
                    data=df, 
                    hue='chirality', 
                    split=True)

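I have tried many different things, but I can't seem to get it to work. With the attempt above I get ValueError: Cannot use 'hue' without 'x' and 'y'. I am not even sure what I could set them to, despite trying various things and reshaping my dataframe further. I think I want x to be the list of labels, y to be the likelihood values, and hue to specify L/R. Thanks for any help!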

Seaborn works easiest with dataframes in "long form", which can be accomplished e.g. via pandas' melt(). The resulting variable and value columns can then be used for x= and y=.
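To make the reshaping concrete, here is a minimal sketch of what melt() does on a tiny frame. The two-row frame and the names `wide` and `long_df` are only for illustration; the values are taken from the first Left and Right rows of the question's data.

```python
import pandas as pd

# Toy wide-form frame: one column per label, plus the identifying columns.
wide = pd.DataFrame({
    'H1': [0.55, 0.35],
    'H2': [0.50, 0.35],
    'chirality': ['Left', 'Right'],
    'image': ['image_0', 'image_0'],
})

# melt() keeps the id_vars columns and stacks every other column into
# one row per (label name, value) pair.
long_df = wide.melt(id_vars=['chirality', 'image'],
                    var_name='H', value_name='value')
print(long_df)
```

Each original row turns into one row per label column, so 2 rows with 2 labels become 4 rows, with the label name in 'H' and the likelihood in 'value'. The full example for the question's dataframe follows.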

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.DataFrame.from_dict(
        {'H1': {0: 0.55, 1: 0.56, 2: 0.46, 3: 0.93, 4: 0.74, 5: 0.35, 6: 0.75, 7: 0.86, 8: 0.81, 9: 0.88},
         'H2': {0: 0.5, 1: 0.55, 2: 0.61, 3: 0.82, 4: 0.51, 5: 0.35, 6: 0.58, 7: 0.66, 8: 0.93, 9: 0.86},
         'H3': {0: 0.42, 1: 0.51, 2: 0.86, 3: 0.59, 4: 0.46, 5: 0.71, 6: 0.58, 7: 0.72, 8: 0.53, 9: 0.92},
         'H4': {0: 0.89, 1: 0.87, 2: 0.04, 3: 0.64, 4: 0.44, 5: 0.05, 6: 0.33, 7: 0.93, 8: 0.08, 9: 0.9},
         'H5': {0: 0.92, 1: 0.75, 2: 0.13, 3: 0.85, 4: 0.51, 5: 0.15, 6: 0.38, 7: 0.92, 8: 0.36, 9: 0.76},
         'chirality': {0: 'Left', 1: 'Left', 2: 'Left', 3: 'Left', 4: 'Left', 5: 'Right', 6: 'Right', 7: 'Right', 8: 'Right', 9: 'Right'},
         'image': {0: 'image_0', 1: 'image_1', 2: 'image_2', 3: 'image_3', 4: 'image_4', 5: 'image_0', 6: 'image_1', 7: 'image_2', 8: 'image_3', 9: 'image_4'}})

df_long = df.melt(id_vars=['chirality', 'image'], value_vars=['H1', 'H2', 'H3', 'H4', 'H5'],
                  var_name='H', value_name='value')

fig, ax = plt.subplots(figsize=(15, 6))
sns.set_theme(style="whitegrid")
sns.violinplot(ax=ax,
               data=df_long,
               x='H',
               y='value',
               hue='chirality',
               palette='summer',
               split=True)
ax.set(xlabel='', ylabel='')
sns.despine()
plt.tight_layout()
plt.show()
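If the starting point is the original layout with prefixed columns, the intermediate reshaping step could probably be skipped. The following is only a sketch under an assumption: that the original columns are named like 'LH1', 'RH1', 'LH2', 'RH2' (the question mentions "LH1" and "RH1" but does not show that frame), with one row per image. The frame `raw` and its column names are hypothetical.

```python
import pandas as pd

# Hypothetical original layout: one row per image, one column per
# side-prefixed label. Values are borrowed from the question's data.
raw = pd.DataFrame({
    'image': ['image_0', 'image_1'],
    'LH1': [0.55, 0.56], 'RH1': [0.35, 0.75],
    'LH2': [0.50, 0.55], 'RH2': [0.35, 0.58],
})

# Stack all label columns into long form first.
long_df = raw.melt(id_vars='image', var_name='label', value_name='value')

# Split a label such as 'LH1' into the side ('L') and the rest ('H1').
long_df['chirality'] = long_df['label'].str[0].map({'L': 'Left', 'R': 'Right'})
long_df['H'] = long_df['label'].str[1:]
long_df = long_df.drop(columns='label')
print(long_df)
```

The resulting frame has the same 'H', 'value' and 'chirality' columns as df_long above, so it could be passed to sns.violinplot with x='H', y='value', hue='chirality', split=True in the same way.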

Here is another example, using the iris dataset, converted to long form to show split violin plots for each combination of two species:

import matplotlib.pyplot as plt
import seaborn as sns

iris = sns.load_dataset('iris')
iris_long = iris.melt(id_vars='species')
iris_long['variable'] = iris_long['variable'].apply(lambda s: s.replace('_', '\n'))
sns.set_style('darkgrid')
fig, axs = plt.subplots(ncols=3, figsize=(12, 4), sharey=True)
palette = {'setosa': 'crimson', 'versicolor': 'cornflowerblue', 'virginica': 'limegreen'}
for excluded, ax in zip(iris.species.unique(), axs):
    sns.violinplot(ax=ax, data=iris_long[iris_long['species'] != excluded],
                   x='variable', y='value', hue='species', palette=palette, split=True)
    ax.set(xlabel='', ylabel='')
plt.tight_layout()
plt.show()