Concatenate Bokeh Stacked Bar Plots to visualise changes

I have two DataFrames:

df1 = pd.DataFrame([['1','1','1','2','2','2','3','3','3'],['1.2','3.5','44','77','3.4','24','11','12','13'], ['30312', '20021', '23423', '23424', '45646', '34535', '35345', '34535', '76786']]).T

df1.columns = ['QID', 'score', 'DocID']

df2 = pd.DataFrame([['1','1','1','2','2','2','3','3','3'],['21.2','13.5','12','77.6','3.9','29','17','41','32'], ['30312', '20021', '23423', '23424', '45646', '34535', '35345', '34535', '76786']]).T

df2.columns = ['QID', 'score', 'DocID']

Currently, I plot the scores from df1 and df2 with Bokeh in two separate charts:

from bokeh.charts import Bar  # legacy bokeh.charts API
from bokeh.layouts import gridplot
from bokeh.io import show

D1_BarDocID = Bar(df1, 'QID', values='score', stack='DocID', title="D1: QID Stacked by DocID on Score")

D2_BarDocID = Bar(df2, 'QID', values='score', stack='DocID', title="D2: QID Stacked by DocID on Score")

grid = gridplot([[D1_BarDocID, D2_BarDocID]])
show(grid)

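However, I would like to plot both DataFrames in a single figure, with the df1 and df2 bars drawn side by side for each QID, so that I can use Bokeh to visualise the score differences between the two DataFrames.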

(Image: plots of df1 and df2, using Bokeh)

Here is a complete example using the newer vbar_stack and the stable bokeh.plotting API. It could probably be made simpler, but my Pandas knowledge is limited:

import pandas as pd

from bokeh.core.properties import value
from bokeh.io import output_file
from bokeh.models import FactorRange
from bokeh.palettes import Spectral8
from bokeh.plotting import figure, show

df1 = pd.DataFrame([['1','1','1','2','2','2','3','3','3'],[1.2, 3.5, 44, 77, 3.4, 24, 11, 12, 13], ['30312', '20021', '23423', '23424', '45646', '34535', '35345', '34535', '76786']]).T
df1.columns = ['QID','score', 'DocID']
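# One row per QID, one column per DocID; missing (QID, DocID) pairs become 0
df1 = df1.pivot(index='QID', columns='DocID', values='score').fillna(0)
# Tag each row with its source frame so the x-axis factors are (QID, 'df1') tuples
df1.index = [(x, 'df1') for x in df1.index]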

df2 = pd.DataFrame([['1','1','1','2','2','2','3','3','3'],[21.2, 13.5, 12, 77.6, 3.9, 29, 17, 41, 32], ['30312', '20021', '23423', '23424', '45646', '34535', '35345', '34535', '76786']]).T
df2.columns = ['QID','score', 'DocID']
df2 = df2.pivot(index='QID', columns='DocID', values='score').fillna(0)
df2.index = [(x,'df2') for x in df2.index]

df = pd.concat([df1, df2])

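# The (QID, frame) tuples passed to FactorRange define a nested categorical x-axis.
# Note: newer Bokeh releases use width= instead of plot_width=, and
# legend_label= (plain strings) instead of legend=.
p = figure(plot_width=800, x_range=FactorRange(*df.index))
p.vbar_stack(df.columns, x='index', width=0.8, fill_color=Spectral8,
             line_color=None, source=df, legend=[value(x) for x in df.columns])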

p.legend.location = "top_left"

output_file('foo.html')
show(p)

This produces: