How to add annotations made from .pct_change() data to a line plot
I have this data:
values = [["Arts & Humanities", 19.00, 13.43, 7.21, 5.11, 2.64],
          ["Life Sciences & Biomedicine", 64.41, 53.89, 45.01, 32.44, 14.82],
          ["Physical Sciences", 43.62, 37.26, 30.72, 19.71, 8.30],
          ["Social Sciences", 50.71, 42.32, 34.19, 26.85, 12.47],
          ["Technology", 52.48, 49.28, 36.65, 29.25, 14.77]]
I have already drawn a line plot of this data.
data = pd.DataFrame(values, columns = ["Research_categories",'2017', '2018', '2019', '2020', '2021'])
data.set_index('Research_categories', inplace=True)
df = data.T
plot = df.plot()
plt.subplots_adjust(right=0.869)
plt.show()
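Now I need to add an annotation to each point, for every year. The annotation should show the percentage change from the previous year, so I prepared this DataFrame: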
percentage_df = data.pct_change(axis='columns')
The DataFrame looks like this:
                             2017      2018      2019      2020      2021
Research_categories
Arts & Humanities             NaN -0.293158 -0.463142 -0.291262 -0.483366
Life Sciences & Biomedicine   NaN -0.163329 -0.164780 -0.279271 -0.543157
Physical Sciences             NaN -0.145805 -0.175523 -0.358398 -0.578894
Social Sciences               NaN -0.165451 -0.192108 -0.214683 -0.535568
Technology                    NaN -0.060976 -0.256291 -0.201910 -0.495043
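How can I take the values from this DataFrame and show them as annotations in the plot?
I am very new to visualization in Python, and so far this has been the tricky part for me. Any help would be much appreciated. Thank you!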
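Matplotlib has a built-in annotate function (Axes.annotate): you pass it the text of the annotation and the coordinates where you want it to appear.
In your case, we just need to loop over the two DataFrames to get the y-value of each point (from data) and the value to write on the chart (from percentage_df).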
for i, column in enumerate(data):
    if column != '2017':  # no point plotting NaNs
        for val1, val2 in zip(data[column], percentage_df[column]):
            plot.annotate(
                text=val2,
                xy=(i, val1),  # must use the counter, as the data is plotted as categorical
            )
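Note that because your data is technically categorical (the years are strings, not numbers), we need enumerate to get a counter that gives us the x-position of each annotation.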
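To see why the counter works, here is a minimal sketch that inspects the x-coordinates pandas actually uses for a string index. The one-column frame is a cut-down stand-in for df = data.T, and the names x_positions and positions are only illustrative.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Cut-down stand-in for df = data.T: years as a string index (assumed example data).
df = pd.DataFrame(
    {"Technology": [52.48, 49.28, 36.65]},
    index=["2017", "2018", "2019"],
)

fig, ax = plt.subplots()
df.plot(ax=ax)

# For a string index, the line is drawn at integer positions;
# the year strings are only used as tick labels.
line = ax.get_lines()[0]
x_positions = [float(x) for x in line.get_xdata()]
print(x_positions)

# enumerate() reproduces the same label -> position mapping.
positions = {label: i for i, label in enumerate(df.index)}
print(positions)
```

If the years were converted to integers instead, the x-coordinates would be the years themselves, and the annotations would need xy=(year, value) rather than the counter.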
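Running that loop gives the following plot:
It meets your criteria, but it looks terrible. So let's make the figure bigger and round the numbers to two decimal places.
Full code: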
import pandas as pd
import matplotlib.pyplot as plt

values = [["Arts & Humanities", 19.00, 13.43, 7.21, 5.11, 2.64],
          ["Life Sciences & Biomedicine", 64.41, 53.89, 45.01, 32.44, 14.82],
          ["Physical Sciences", 43.62, 37.26, 30.72, 19.71, 8.30],
          ["Social Sciences", 50.71, 42.32, 34.19, 26.85, 12.47],
          ["Technology", 52.48, 49.28, 36.65, 29.25, 14.77]
          ]
data = pd.DataFrame(values, columns=["Research_categories", '2017', '2018', '2019', '2020', '2021'])
data.set_index('Research_categories', inplace=True)
df = data.T

# A larger, higher-resolution figure leaves more room for the labels
fig, ax = plt.subplots(1, 1, figsize=(8, 5), dpi=150)
df.plot(ax=ax)

percentage_df = data.pct_change(axis='columns')
for i, column in enumerate(data):
    if column != '2017':  # no point plotting NaNs
        for val1, val2 in zip(data[column], percentage_df[column]):
            ax.annotate(
                text=round(val2, 2),
                xy=(i, val1),  # must use the counter, as the data is plotted as categorical
            )
plt.show()
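If the labels are still hard to read, one possible refinement is to format them as percentages and nudge them off the data points. This is only a sketch under a few assumptions: it uses two of the five categories to stay short, the 6-point offset and font size are arbitrary choices, and the labels list exists only so the generated strings can be inspected.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Two rows of the original data, to keep the sketch short.
values = [
    ["Arts & Humanities", 19.00, 13.43, 7.21, 5.11, 2.64],
    ["Technology", 52.48, 49.28, 36.65, 29.25, 14.77],
]
years = ["2017", "2018", "2019", "2020", "2021"]
data = pd.DataFrame(values, columns=["Research_categories"] + years)
data = data.set_index("Research_categories")
percentage_df = data.pct_change(axis="columns")

fig, ax = plt.subplots(figsize=(8, 5))
data.T.plot(ax=ax, marker="o")

labels = []  # illustrative only: collects the strings written on the plot
for i, column in enumerate(data.columns):
    for y, change in zip(data[column], percentage_df[column]):
        if pd.isna(change):  # the first year has no previous value
            continue
        label = f"{change:+.1%}"  # e.g. -0.293158 becomes "-29.3%"
        labels.append(label)
        ax.annotate(
            label,
            xy=(i, y),                   # the data point being annotated
            xytext=(0, 6),               # shift the text 6 points upward
            textcoords="offset points",
            ha="center",
            fontsize=8,
        )
```

Call plt.show() afterwards to display the figure. Skipping NaN values with pd.isna replaces the hard-coded '2017' check, so the loop keeps working if the first column has a different label.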