Adding the median as text to plotly express px.box facets

Question

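How can I add the median value for each of the 3 target categories in each of the 4 facet subplots? I would like to place the values at the bottom of each subplot or to the right of each box.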

from sklearn.datasets import load_iris
import pandas as pd
import plotly.express as px

data = load_iris(as_frame=True)
df = data.data.assign(target=data.target)

melted_df = df.melt(id_vars='target')

px.box(melted_df, x='target', y='value', facet_col='variable', height=500)

Answer 1

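Box plots have no built-in option for displaying the median as text, so I handled this with annotations. First, build a data frame of the medians. Then build a list of the facet names, extracted from the figure's title annotations, to use as a filter condition, and build lists of axis names, one per annotation, to use in the loop. The text offset ax=40 has no effect, because the display position differs on each x axis; I don't know the cause. So I changed the text color to one that stays readable even where it overlaps the box. I'll leave the fine-tuning to you.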

from sklearn.datasets import load_iris
import pandas as pd
import plotly.express as px

data = load_iris(as_frame=True)
df = data.data.assign(target=data.target)

melted_df = df.melt(id_vars='target')
# median data
median_df = melted_df.groupby(['variable','target'])['value'].median().to_frame('median').reset_index()

fig = px.box(melted_df, x='target', y='value', facet_col='variable', height=500)

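# facet titles look like 'variable=sepal length (cm)'; [9:] strips the 'variable=' prefix
graph_name = [fig.layout['annotations'][i]['text'][9:] for i in range(4)]
# one axis reference per annotation: 3 target classes in each of the 4 subplots
xref = sum([['x1']*3,['x2']*3,['x3']*3,['x4']*3],[])
yref = sum([['y1']*3,['y2']*3,['y3']*3,['y4']*3],[])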

i = 0
for name in graph_name:
    dfm = median_df.query('variable == @name')
    for row in dfm.itertuples(name=None):
        fig.add_annotation(
            dict(x=row[2],
                 y=row[3],
                 xref=xref[i],
                 yref=yref[i],
                 text=str(row[3]),
                 font=dict(color='red'),
                 showarrow=False,
                 ax=40))
        i += 1

fig.show()

Answer 2

Posting a refactored version of @r-beginners's solution. Thanks again!

from sklearn.datasets import load_iris
import pandas as pd
import plotly.express as px

data = load_iris(as_frame=True)
df = data.data.assign(target=data.target)

melted_df = df.melt(id_vars='target')
median_df = melted_df.groupby(['variable','target'])['value'].median().to_frame('median')


fig = px.box(melted_df, x='target', y='value', facet_col='variable', height=500)

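for i, annotation in enumerate(fig.layout['annotations']):
    # facet titles look like 'variable=<name>'; recover the variable name
    variable = annotation['text'].replace('variable=', '')

    # rows of the per-variable slice are (target, median) pairs
    for (target_category, value) in median_df.loc[(variable,)].itertuples(name=None):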
        fig.add_annotation(
            dict(x=target_category,
                 y=value,
                 xref='x' + str(i + 1),
                 yref='y' + str(i + 1),
                 text=str(value),
                 font=dict(color='red'),
                 showarrow=False,
                 ax=40))

    
fig.show()

python

pandas

plotly

plotly-express