Python stacked area chart

Question

I am trying to create a stacked area chart that shows how the courses and their counts evolve over time. My DataFrame (index=Year) looks like this:

                    Area  Courses
Year                             
1900         Agriculture      0.0
1900        Architecture     32.0
1900           Astronomy     10.0
1900             Biology     20.0
1900           Chemistry     25.0
1900   Civil Engineering     21.0
1900           Education     14.0
1900  Engineering Design     10.0
1900             English     30.0
1900           Geography      1.0

The last year is 2011.

I tried several approaches, such as df.plot.area() and df.plot.area(x='Years'). Then I thought that having the areas as columns would help, so I tried:

df.pivot_table(index = 'Year', columns = 'Area', values = 'Courses', aggfunc = 'sum')

But instead of the total number of courses per year, I got:

Area  Aeronautical Engineering  ...  Visual Design
Year                            ...               
1900                       NaN  ...            NaN
1901                       NaN  ...            NaN

Thanks for your help. This is my first post, so I apologize if I missed anything.

Update: here is my code:

df = pd.read_csv(filepath, encoding= 'unicode_escape')
df = df.groupby(['Year','GenArea'])['Taught'].sum().to_frame(name = 'Courses').reset_index()
plt.stackplot(df['Year'], df['Courses'], labels = df['GenArea'])
plt.legend(loc='upper left')
plt.show()

Here is the link to the dataset: https://data.world/makeovermonday/2020w12
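
A note on the plt.stackplot call in the update: stackplot takes a single x array plus one row of y values per series, whereas the grouped frame is in long format (each year repeated once per area). The following is a minimal sketch on made-up values (not taken from the dataset), assuming the column names Year, GenArea and Courses from the code above; it reshapes the data to one column per area before plotting.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up long-format data (illustrative values only).
long_df = pd.DataFrame({
    "Year":    [1900, 1900, 1901, 1901],
    "GenArea": ["Biology", "Chemistry", "Biology", "Chemistry"],
    "Courses": [20.0, 25.0, 22.0, 27.0],
})

# Reshape so that each area is a column and each year appears once.
wide_df = long_df.pivot(index="Year", columns="GenArea", values="Courses").fillna(0)

fig, ax = plt.subplots()
# stackplot takes one x array plus one y row per series, hence the transpose.
polys = ax.stackplot(wide_df.index, wide_df.T.to_numpy(), labels=wide_df.columns)
ax.legend(loc="upper left")
# Call plt.show() to display the figure when running as a script.
```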

Answer 1

Based on the additional information provided, I put this together. I hope you like it!

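import pandas as pd
import matplotlib.pyplot as plt

plt.close('all')

# Load the data and sum the courses taught for each (Year, GenArea) pair
df=pd.read_csv('https://query.data.world/s/djx5mi7dociacx7smdk45pfmwp3vjo',
               encoding='unicode_escape')
df=df.groupby(['Year','GenArea'])['Taught'].sum().to_frame(name=
             'Courses').reset_index()
# True where a GenArea / Year value has already appeared in an earlier row
aux1=df.duplicated(subset='GenArea', keep='first').values
aux2=df.duplicated(subset='Year', keep='first').values

n=len(aux1);year=[];courses=[]

# Collect the unique areas and years in order of first appearance
for i in range(n):
    if not aux1[i]:
        courses.append(df.iloc[i]['GenArea'])
    if not aux2[i]:
        year.append(df.iloc[i]['Year'])
    else:
        continue

del aux1,aux2
# Wide frame: one row per year, one column per area, initialised to 0
df1=pd.DataFrame(index=year)
s=0

for i in range(len(courses)):
    df1[courses[i]]=0
# Fill the wide frame row by row; s moves on to the next year only once
# the current year's row contains no zeros
for i in range(n):
    string=df.iloc[i]['GenArea']
    if any(df1.iloc[s].values==0):
        df1.at[year[s],string]=df.iloc[i]['Courses']
    else:
        s+=1
        df1.at[year[s],string]=df.iloc[i]['Courses']

del year,courses,df
# Reverse the column order, then draw the stacked area chart
df1=df1[df1.columns[::-1]]
df1.plot.area(legend='reverse')
plt.show()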
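
The same reshaping can probably be done more compactly with pivot_table. Below is a minimal sketch on a small made-up frame (the values are illustrative, not from the dataset), assuming the grouped data has the Year, GenArea and Courses columns produced above. Passing fill_value=0 replaces the NaN that would otherwise appear for year/area combinations that have no rows.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up long-format data standing in for the grouped frame (illustrative values only).
long_df = pd.DataFrame({
    "Year":    [1900, 1900, 1901, 1901, 1902],
    "GenArea": ["Biology", "Chemistry", "Biology", "Chemistry", "Biology"],
    "Courses": [20.0, 25.0, 22.0, 27.0, 24.0],
})

# One row per year, one column per area; missing combinations become 0 instead of NaN.
wide_df = long_df.pivot_table(index="Year", columns="GenArea",
                              values="Courses", aggfunc="sum", fill_value=0)

# DataFrame.plot.area stacks the columns by default.
ax = wide_df.plot.area()
ax.set_ylabel("Courses")
# Call plt.show() to display the figure when running as a script.
```

With the real data, long_df would be the frame returned by the groupby step, and no explicit loops are needed.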
