Add error bars to a chart with color encoding

I am trying to add error bars to a chart using mark_errorbar(), but when I add a color encoding, the error bars disappear.

Here is the code for the base chart:

base = alt.Chart(chart_df).mark_line().encode(
x = alt.X('Session:N'),
y = alt.Y('CR Lever'),
color = alt.Color('Phenotype:N', scale=alt.Scale(domain=['GT','IN','ST'],
                                                 range = ['red', 'blue', 'green']))
)

But when I try to get the error bar chart with the following code, nothing shows up.

alt.Chart(chart_df).mark_errorbar(extent='stderr').encode(
x = alt.X('Session:N'),
y = alt.Y('CR Lever'),
color = alt.Color('Phenotype:N', scale=alt.Scale(domain=['GT','IN','ST'],
                                                 range = ['red', 'blue', 'green']))
)

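Simply removing the color encoding makes the error bars show up, but is there a way to get them to appear on the color-separated chart?

Full minimal reproducible example: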

import pandas as pd
import numpy as np
import random
import altair as alt

# Construct DF
base_df = pd.DataFrame()
id = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
c = ['ST', 'IN', 'GT']
x = [1,2,3,4,5]
y = []

base_df['id'] = np.repeat(id, 5)
base_df['c'] = np.repeat(c, 25)
base_df['x'] = np.tile(x, 15)

for group in base_df['c']:
  if group == 'ST':
    y.append(random.randint(90,130))
  if group == 'IN':
    y.append(random.randint(30,80))
  if group == 'GT':
    y.append(random.randint(1, 10))

base_df['y'] = y

chart_df = base_df[['c','x','y']].groupby(['x','c']).agg('mean').reset_index()

#Create the base chart to which I want to add error bars
base = alt.Chart(chart_df).mark_line().encode(
  x = alt.X('x:N'),
  y = alt.Y('y'),
  color = alt.Color('c:N', scale=alt.Scale(domain=['GT','IN','ST'],
                                                  range = ['red', 'blue', 'green']))
)


# Error bar chart without the color encoding (no grouping). This works fine
bars_no_group = alt.Chart(chart_df).mark_errorbar(extent='stderr').encode(
  x = alt.X('x:N'),
  y = alt.Y('y')
)

# Error bar chart grouped by the color encoding. This does not work
bars_group = alt.Chart(chart_df).mark_errorbar(extent='stderr').encode(
  x = alt.X('x:N'),
  y = alt.Y('y'),
  color = alt.Color('c:N', scale=alt.Scale(domain=['GT','IN','ST'],
                                                  range = ['red', 'blue', 'green']))
)

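The reason it does not work is that there are not enough data points in each color group to create error bars. After the pandas groupby, chart_df holds exactly one row (the mean) per combination of x and c, so once the color encoding splits the data by c there is no spread left from which to compute a standard error. Without the color encoding, each x position still has three rows (one per group), which is why those bars render. If you let Altair aggregate the data instead of doing the groupby in pandas, it works as expected, because there are now multiple data points for each color and x position: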

# Comment out this groupby operation to make sure there are enough data points in each color group
chart_df = base_df[['c','x','y']] #.groupby(['x','c']).agg('mean').reset_index()

#Create the base chart to which I want to add error bars
lines = alt.Chart(chart_df).mark_line().encode(
  x = alt.X('x:N'),
  y = alt.Y('mean(y)'),
  color = alt.Color('c:N', scale=alt.Scale(domain=['GT','IN','ST'],
                                                  range = ['red', 'blue', 'green']))
)

errorbars = alt.Chart(chart_df).mark_errorbar(extent='stderr').encode(
  x = alt.X('x:N'),
  y = alt.Y('y'),
  color = alt.Color(
      'c:N',
      scale=alt.Scale(
          domain=['GT','IN','ST'],
          range = ['red', 'blue', 'green']
      )
  )
)

lines + errorbars
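
To check the explanation above without rendering anything, one can count the rows per (x, c) group before and after the pandas groupby. The sketch below is only an illustration: it rebuilds a frame shaped like base_df with a seeded NumPy generator (the names raw_df, agg_df and rng are made up for this example), and it uses pandas' sem() as a stand-in for the standard error that extent='stderr' asks for, on the assumption that the two behave similarly when a group holds a single value.

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like base_df from the question (seeded for repeatability).
rng = np.random.default_rng(0)
groups = np.repeat(['ST', 'IN', 'GT'], 25)
ranges = {'ST': (90, 130), 'IN': (30, 80), 'GT': (1, 10)}
raw_df = pd.DataFrame({
    'c': groups,
    'x': np.tile([1, 2, 3, 4, 5], 15),
    # high bound is exclusive in rng.integers, so add 1 to mimic random.randint
    'y': [int(rng.integers(ranges[g][0], ranges[g][1] + 1)) for g in groups],
})

# The pre-aggregation done in the question: one mean per (x, c).
agg_df = raw_df.groupby(['x', 'c']).agg('mean').reset_index()

# Number of rows available to the error bar in each (x, c) cell.
raw_counts = raw_df.groupby(['x', 'c']).size()
agg_counts = agg_df.groupby(['x', 'c']).size()

# Standard error per cell (pandas' sem, used here as a stand-in for 'stderr').
raw_sem = raw_df.groupby(['x', 'c'])['y'].sem()
agg_sem = agg_df.groupby(['x', 'c'])['y'].sem()

print(raw_counts.unique(), agg_counts.unique())
print(agg_sem.isna().all())
```

With the raw frame every cell has several rows and a defined standard error; with the pre-aggregated frame every cell has a single row, so there is nothing for the error bar to span.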