Altair syntax for condition involving aggregate field

I am trying to create a condition that involves an aggregate field. For this example dataset:

import altair as alt
import pandas as pd

df = pd.DataFrame(
    [['game1', 'player1', 2, 1], ['game1', 'player2', 3, 4], ['game1', 'player3', 2, 2],
     ['game2', 'player1', 0, 3], ['game2', 'player2', 4, 4], ['game2', 'player3', 3, 3]],
    columns=['game', 'player', 'score1', 'score2'],
)
# Color each point by comparing the two scores in that row.
color = {'condition': [{"value": "green", "test": "datum.score2 > datum.score1"},
                       {"value": "yellow", "test": "datum.score2 == datum.score1"},
                       {"value": "red", "test": "datum.score2 < datum.score1"}]}
alt.Chart(df).mark_point().encode(x='score2', y='player', color=color)

I get this chart:

But if I want a chart that shows only the mean for each player, I can't find the right syntax for the condition.

alt.Chart(df).mark_point().encode(x='mean(score2)',y='player',color=color)

I tried:

"test": "mean(datum.score2) > mean(datum.score1)"

"test": "datum.mean(score2) > datum.mean(score1)"

Neither of them worked. I couldn't find any explanation of the syntax in the documentation.

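mean() is a shorthand in Altair that can be used in encoding fields and in transforms, but not directly in conditions. To use a mean in a condition, you need to create new columns for the means in a separate step with transform_aggregate. The example here uses transform_joinaggregate instead, because you want to plot the original values from the dataframe rather than the aggregated ones: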

color={
    'condition': [
        {"value":"green", "test": "datum.mean_score2 > datum.mean_score1"},
        {"value":"yellow", "test": "datum.mean_score2 == datum.mean_score1"},
        {"value":"red", "test": "datum.mean_score2 < datum.mean_score1"}
    ]
}

alt.Chart(df).mark_point().encode(
    x='score2',
    y='player',
    color=color
).transform_joinaggregate(
    mean_score1='mean(score1)',
    mean_score2='mean(score2)',
    groupby=['player']
)

If you want to plot the means instead, it would look like this:

alt.Chart(df).mark_point().encode(
    x='mean_score2:Q',
    y='player',
    color=color
).transform_aggregate(
    mean_score1='mean(score1)',
    mean_score2='mean(score2)',
    groupby=['player']
)
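
To check which color each player should end up with, the two transforms can be imitated in plain Python. This is only a rough sketch of what the grouping computes, not how Altair or Vega-Lite implements it; the helper names player_means and color_for are made up for illustration, and rows repeats the example data from the question.

```python
from collections import defaultdict
from statistics import mean

# Same rows as the example DataFrame: (game, player, score1, score2).
rows = [
    ("game1", "player1", 2, 1), ("game1", "player2", 3, 4), ("game1", "player3", 2, 2),
    ("game2", "player1", 0, 3), ("game2", "player2", 4, 4), ("game2", "player3", 3, 3),
]


def player_means(rows):
    """Hypothetical helper: per-player means, roughly what groupby=['player'] yields."""
    grouped = defaultdict(list)
    for _game, player, score1, score2 in rows:
        grouped[player].append((score1, score2))
    return {
        player: {
            "mean_score1": mean(s1 for s1, _ in scores),
            "mean_score2": mean(s2 for _, s2 in scores),
        }
        for player, scores in grouped.items()
    }


def color_for(means):
    """Hypothetical helper: mirrors the three test expressions in the color condition."""
    if means["mean_score2"] > means["mean_score1"]:
        return "green"
    if means["mean_score2"] == means["mean_score1"]:
        return "yellow"
    return "red"


# Like transform_aggregate: one record per player, original rows are gone.
aggregated = player_means(rows)

# Like transform_joinaggregate: every original row is kept, with the means attached.
joined = [
    {"game": g, "player": p, "score1": s1, "score2": s2, **aggregated[p]}
    for g, p, s1, s2 in rows
]

colors = {player: color_for(means) for player, means in aggregated.items()}
print(colors)  # {'player1': 'green', 'player2': 'green', 'player3': 'yellow'}
```

With this data, aggregated has three records and joined has six, which is why the first chart still shows every game while the second shows one point per player.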