Stacked text in a stacked area chart using Altair
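I am wondering whether it is possible to place text marks in the corresponding areas of a stacked area chart.
I use a median aggregate to get a single X and Y value per series; otherwise the text is drawn along the whole edge of the chart. However, this aggregation is not foolproof: if the chart is a bit complex, that X position may not be the best place to show the text.
Here is what I have so far: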
import pandas as pd
import altair as alt

X = [1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9]
V = [1,1,1,2,4,8,6,4,2,1,2,3,4,5,6,7,6,5,1,1,1,1,4,8,4,2,1,1,1,3,4,5,6,6,5,4]
key = ['a', 'b', 'c', 'd']
K = [y for x in key for y in x * 9]  # each key repeated 9 times
demo = pd.DataFrame({'X': X, 'V': V, 'K': K})

# stacked area chart, one band per key
a = alt.Chart(demo).mark_area().encode(
    x='X:O',
    y='V:Q',
    color='K:N'
)
# one text mark per key, placed at that key's median X and median V
t = alt.Chart(demo).mark_text().encode(
    x='median(X):O',
    y='median(V):Q',
    text=alt.Text('K:N')
)
a + t
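Problems
- The text is not in the correct area.
- The order of the text is also wrong.
It is not that I don't understand why I have these problems; I do (the Y positions are not aggregated as "stacked" on top of each other). I just don't know how to fix it, or whether it is doable at all right now.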
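To make the cause concrete, here is a small pandas sketch (not part of the original post) that computes what the two median aggregates evaluate to for each series. The name `medians` is just an illustrative variable.

```python
import pandas as pd

X = list(range(1, 10)) * 4
V = [1,1,1,2,4,8,6,4,2, 1,2,3,4,5,6,7,6,5,
     1,1,1,1,4,8,4,2,1, 1,1,3,4,5,6,6,5,4]
K = [k for k in "abcd" for _ in range(9)]
demo = pd.DataFrame({"X": X, "V": V, "K": K})

# Per-series medians: these are the coordinates the text layer ends up using.
medians = demo.groupby("K")[["X", "V"]].median()
print(medians)
```

Every label lands at X=5, and the Y values are 2, 5, 1 and 4 for a, b, c and d. Those are raw per-series values, not heights inside the stack, which is why the labels sit in the wrong bands and in the wrong order.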
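I would just build a separate dataframe for the text and use that as the source. It is easier and more customizable than doing all sorts of transformations in Altair, if that is even possible in this case.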
import pandas as pd
import altair as alt

X = [1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9]
V = [1,1,1,2,4,8,6,4,2,1,2,3,4,5,6,7,6,5,1,1,1,1,4,8,4,2,1,1,1,3,4,5,6,6,5,4]
key = ['a', 'b', 'c', 'd']
K = [y for x in key for y in x * 9]
demo = pd.DataFrame({'X': X, 'V': V, 'K': K})

# find the X position where the sum of V over all K is the maximum (this is at X=6);
# selecting "V" first keeps the string column K out of the sum
idxmax = demo.groupby("X")["V"].sum().idxmax()

# find the cumulative sum of V at position idxmax and
# take away some offset (4) so the labels go down a bit
# iloc[::-1] reverses the order because we want the cumulative sum to start from the bottom (from 'd')
ypos = demo.groupby(["X", "K"])["V"].sum().loc[idxmax].iloc[::-1].cumsum() - 4

# create a new dataframe for the text: X column=idxmax, Y column=cumulative ypos, K=key
demotext = pd.DataFrame([[idxmax, y, k] for y, k in zip(ypos.tolist(), key[::-1])],
                        columns=["X", "Y", "K"])

a = (alt.Chart(demo).mark_area()
     .encode(
         x='X:O',
         y='V:Q',
         color='K:N')
     )

t = (alt.Chart(demotext).mark_text()
     .encode(
         x='X:O',
         y='Y:Q',
         text='K:N')
     )
a + t
Output
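The fixed offset of 4 is tuned by eye for this data. A possible variation, sketched below under the same assumption that 'd' is the bottom band, puts each label at the vertical midpoint of its band instead: the top of the band minus half of its height. The names `x_peak`, `at_peak`, `top` and `mid` are made up for this sketch.

```python
import pandas as pd

X = list(range(1, 10)) * 4
V = [1,1,1,2,4,8,6,4,2, 1,2,3,4,5,6,7,6,5,
     1,1,1,1,4,8,4,2,1, 1,1,3,4,5,6,6,5,4]
K = [k for k in "abcd" for _ in range(9)]
demo = pd.DataFrame({"X": X, "V": V, "K": K})

# X position where the total stack is tallest
x_peak = demo.groupby("X")["V"].sum().idxmax()

# Band heights at that X, ordered bottom-to-top (assumed order: d, c, b, a)
at_peak = demo[demo["X"] == x_peak].set_index("K")["V"].iloc[::-1]

top = at_peak.cumsum()       # upper edge of each band
mid = top - at_peak / 2      # vertical centre of each band

# Same column layout as demotext in the answer above
demotext = pd.DataFrame({"X": x_peak, "Y": mid.values, "K": mid.index})
print(demotext)
```

The resulting `demotext` can be passed to the text layer exactly as in the code above. It still uses a single X for all labels, so it inherits the same limitation when the peaks of the series do not line up.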
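I have come to realize that it may not be possible to do this programmatically, or that it may not be worth doing, because of the complexity involved (for example, peaks that do not line up).
Modifying the data given above slightly highlights the problem: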
import pandas as pd
import altair as alt

X = [1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9]
V = [1,1,1,2,4,8,6,4,2,1,2,3,4,5,6,7,6,5,1,1,1,1,4,8,4,2,1,1,5,9,5,3,1,1,1,1]
key = ['a', 'b', 'c', 'd']
K = [y for x in key for y in x * 9]
demo = pd.DataFrame({'X': X, 'V': V, 'K': K})

# get the rows where each series peaks (the X and V of each per-series maximum)
demo.loc[demo.groupby('K')['V'].idxmax()]

# recompute the label positions for the modified data, as in the previous snippet;
# the tallest stack is still at X=6, where 'd' is only 1 high, so its label ends up at -3
idxmax = demo.groupby("X")["V"].sum().idxmax()
ypos = demo.groupby(["X", "K"])["V"].sum().loc[idxmax].iloc[::-1].cumsum() - 4

# create a new dataframe for the text: X column=idxmax, Y column=cumulative ypos, K=key
demotext = pd.DataFrame([[idxmax, y, k] for y, k in zip(ypos.tolist(), key[::-1])],
                        columns=["X", "Y", "K"])

a = (alt.Chart(demo).mark_area()
     .encode(
         x='X:O',
         y='V:Q',
         color='K:N')
     )

t = (alt.Chart(demotext).mark_text()
     .encode(
         x='X:O',
         y='Y:Q',
         text='K:N')
     )
a + t
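One way to cope with peaks that do not line up might be to give every series its own X: the position where that series is tallest, with Y at the middle of its band there. This is only a sketch, not something from the original posts. It assumes the bands stack bottom-to-top as d, c, b, a, and the names `order`, `wide`, `tops`, `mids` and `labels` are invented for the example.

```python
import pandas as pd

X = list(range(1, 10)) * 4
V = [1,1,1,2,4,8,6,4,2, 1,2,3,4,5,6,7,6,5,
     1,1,1,1,4,8,4,2,1, 1,5,9,5,3,1,1,1,1]
K = [k for k in "abcd" for _ in range(9)]
demo = pd.DataFrame({"X": X, "V": V, "K": K})

order = ["d", "c", "b", "a"]  # assumed bottom-to-top stacking order

# One row per X, one column per series, columns in stacking order
wide = demo.pivot(index="X", columns="K", values="V")[order]

tops = wide.cumsum(axis=1)    # upper edge of each band at every X
mids = tops - wide / 2        # vertical centre of each band at every X
x_peak = wide.idxmax()        # X at which each series is tallest

labels = pd.DataFrame({
    "K": order,
    "X": [int(x_peak[k]) for k in order],
    "Y": [float(mids.loc[x_peak[k], k]) for k in order],
})
print(labels)
```

For the modified data this puts d at (3, 4.5), c at (6, 5.0), b at (7, 8.5) and a at (6, 19.0). Labels of neighbouring bands can still end up close together, so some manual adjustment may be needed anyway.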
If you have a really complex chart, I think the best and simplest approach is to build the text data manually:
a = (alt.Chart(demo).mark_area()
     .encode(
         x='X:O',
         y='V:Q',
         color='K:N')
     )

t = (alt.Chart(p).mark_text()
     .encode(
         x='X:O',
         y='csum:Q',
         text='K:N')
     )
a + t
where p is:
   K  V  X  csum
0  a  8  6    18
1  b  7  7     7
2  c  8  6     4
3  d  9  3     4
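For completeness, a frame like `p` can be typed in directly. The sketch below simply copies the values from the table above; the `csum` column holds the hand-picked Y positions for the labels.

```python
import pandas as pd

# Hand-built label data: one row per series, values taken from the table above
p = pd.DataFrame({
    "K": ["a", "b", "c", "d"],
    "V": [8, 7, 8, 9],        # peak value of each series
    "X": [6, 7, 6, 3],        # X position of that peak
    "csum": [18, 7, 4, 4],    # manually chosen Y position of the label
})
print(p)
```

Adjusting a label then only means editing one number in `csum` or `X`.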