Stacked text in a stacked area chart using Altair

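I would like to know whether it is possible to place text marks inside the corresponding areas of a stacked area chart.

I used the median aggregate to get a single X and Y value per series; otherwise the text is drawn along the entire edge of the chart. This aggregation is not foolproof, though: if the chart is a bit more complex, the median X position may not be the best place to show the text.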

This is what I have so far:

import pandas as pd
import altair as alt

X=[1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9]
V=[1,1,1,2,4,8,6,4,2,1,2,3,4,5,6,7,6,5,1,1,1,1,4,8,4,2,1,1,1,3,4,5,6,6,5,4]
key=['a', 'b', 'c', 'd']
K = [y for x in key for y in (x)*9]

demo = pd.DataFrame({'X': X, 'V': V, 'K': K})
a = alt.Chart(demo).mark_area().encode(
    x='X:O',
    y='V:Q',
    color='K:N'
)
t = alt.Chart(demo).mark_text().encode(
    x='median(X):O',
    y='median(V):Q',
    text=alt.Text('K:N',)
)
a+t

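The problem

It is not that I don't understand why I get this result, I do (the Y positions are not aggregated in a way that "stacks" them on top of each other), but I don't know how to fix it or whether it is even feasible right now.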
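The point in parentheses can be checked outside Altair. The following is a minimal pandas sketch, not part of the original post (the name `med` is only illustrative), that computes the same per-series medians the text layer asks for. Every label gets the same X, and the Y values are plain medians of each series rather than heights within the stack:

```python
import pandas as pd

# same data as in the question
X = [1, 2, 3, 4, 5, 6, 7, 8, 9] * 4
V = [1, 1, 1, 2, 4, 8, 6, 4, 2,
     1, 2, 3, 4, 5, 6, 7, 6, 5,
     1, 1, 1, 1, 4, 8, 4, 2, 1,
     1, 1, 3, 4, 5, 6, 6, 5, 4]
K = [k for k in ["a", "b", "c", "d"] for _ in range(9)]
demo = pd.DataFrame({"X": X, "V": V, "K": K})

# median(X) and median(V) per series, i.e. what the text layer is positioned by
med = demo.groupby("K")[["X", "V"]].median()
print(med)
```

Since these medians ignore the other series entirely, the labels cannot follow the stacked bands.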

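I would simply build a separate dataframe for the text and use it as the source. That is easier and more customizable than doing all kinds of transforms in Altair, if that is even possible in this case.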

import pandas as pd
import altair as alt

X=[1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9]
V=[1,1,1,2,4,8,6,4,2,1,2,3,4,5,6,7,6,5,1,1,1,1,4,8,4,2,1,1,1,3,4,5,6,6,5,4]
key=['a', 'b', 'c', 'd']
K = [y for x in key for y in (x)*9]

demo = pd.DataFrame({'X': X, 'V': V, 'K': K})

# find the X position where the sum of V over all K is largest (this is at X=6)
# (select the V column first so the string column K is not summed as well)
idxmax = demo.groupby("X")["V"].sum().idxmax()
# take the cumulative sum of V at position idxmax and
# subtract an offset (4) so the labels sit a bit below the top of each band
# iloc[::-1] reverses the order because we want the cumulative sum to start from the bottom (from 'd')
ypos = demo.groupby(["X", "K"])["V"].sum().loc[idxmax].iloc[::-1].cumsum() - 4
# create a new dataframe for the text: X column=idxmax, Y column=cumulative ypos, K=key
demotext = pd.DataFrame([[idxmax, y, k] for y,k in zip(ypos.tolist(), key[::-1])],
                        columns=["X", "Y", "K"])


a = (alt.Chart(demo).mark_area()
        .encode(
                x='X:O',
                y='V:Q',
                color='K:N')
    )
t = (alt.Chart(demotext).mark_text()
        .encode(
                x='X:O',
                y='Y:Q',
                text='K:N'
))

a+t

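Output

I have since realized that it may not be possible to do this programmatically, or that it may not be worth doing, because of the complexity involved (for example, peaks that do not line up). Slightly modifying the data given above highlights the problem: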

import pandas as pd
import altair as alt

X=[1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9]
V=[1,1,1,2,4,8,6,4,2,1,2,3,4,5,6,7,6,5,1,1,1,1,4,8,4,2,1,1,5,9,5,3,1,1,1,1]
key=['a', 'b', 'c', 'd']
K = [y for x in key for y in (x)*9]

demo = pd.DataFrame({'X': X, 'V': V, 'K': K})

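# get the X and V of each series' own peak (groupby('K').max() would report
# the largest X, not the X of the peak); the peaks no longer line up
demo.loc[demo.groupby("K")["V"].idxmax()]

# same placement as before: every label goes at the X with the largest total V
idxmax = demo.groupby("X")["V"].sum().idxmax()
ypos = demo.groupby(["X", "K"])["V"].sum().loc[idxmax].iloc[::-1].cumsum() - 4

# create a new dataframe for the text: X column=idxmax, Y column=cumulative ypos, K=key
demotext = pd.DataFrame([[idxmax, y, k] for y,k in zip(ypos.tolist(), key[::-1])],
                        columns=["X", "Y", "K"])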


a = (alt.Chart(demo).mark_area()
        .encode(
                x='X:O',
                y='V:Q',
                color='K:N')
    )
t = (alt.Chart(demotext).mark_text()
        .encode(
                x='X:O',
                y='Y:Q',
                text='K:N'
))
a+t

If you have a really complex chart, I think the best and simplest way is to build the text data by hand:

a = (alt.Chart(demo).mark_area()
        .encode(
                x='X:O',
                y='V:Q',
                color='K:N')
    )
t = (alt.Chart(p).mark_text()
        .encode(
                x='X:O',
                y='csum:Q',
                text='K:N'
))
a+t

where p is:

    K   V   X   csum
0   a   8   6   18
1   b   7   7   7
2   c   8   6   4
3   d   9   3   4
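
For completeness, here is a sketch of how p could be typed in, plus one possible way to compute a comparable starting point with pandas. The helper `peak_label_positions` and the name `p_auto` are hypothetical and not part of the original answer. It assumes, as the earlier code does, that the bands stack from 'd' at the bottom up to 'a' at the top, and it puts each label at the vertical midpoint of its band at that series' own peak, so its numbers are close to the hand-picked csum values but not identical.

```python
import pandas as pd

# the modified data from above
X = [1, 2, 3, 4, 5, 6, 7, 8, 9] * 4
V = [1, 1, 1, 2, 4, 8, 6, 4, 2,
     1, 2, 3, 4, 5, 6, 7, 6, 5,
     1, 1, 1, 1, 4, 8, 4, 2, 1,
     1, 5, 9, 5, 3, 1, 1, 1, 1]
K = [k for k in ["a", "b", "c", "d"] for _ in range(9)]
demo = pd.DataFrame({"X": X, "V": V, "K": K})

# p exactly as in the table, typed in by hand
p = pd.DataFrame({"K": ["a", "b", "c", "d"],
                  "V": [8, 7, 8, 9],
                  "X": [6, 7, 6, 3],
                  "csum": [18, 7, 4, 4]})


def peak_label_positions(df, bottom_to_top):
    """Hypothetical helper: one label per series, at that series' peak X,
    vertically centred in its band (stack order is an assumption)."""
    # one row per X, one column per series, columns ordered bottom to top
    wide = df.pivot(index="X", columns="K", values="V")[bottom_to_top]
    top = wide.cumsum(axis=1)   # upper edge of each band at every X
    mid = top - wide / 2        # vertical centre of each band at every X
    rows = []
    for k in bottom_to_top:
        x = wide[k].idxmax()    # X where this series is tallest
        rows.append({"K": k, "V": wide.loc[x, k], "X": x, "csum": mid.loc[x, k]})
    return pd.DataFrame(rows)


p_auto = peak_label_positions(demo, bottom_to_top=["d", "c", "b", "a"])
print(p_auto)
```

Either frame can be passed to the text layer above in place of p. The computed positions are only a starting point; for a crowded chart, adjusting individual rows by hand is still likely to look better.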