How do I groupby, count or sum and then plot two lines in Pandas?
Say I have the following dataframes:
Earthquakes:
        latitude    longitude        place  year
0      36.087000  -106.168000   New Mexico  1973
1      33.917000   -90.775000  Mississippi  1973
2      37.160000  -104.594000     Colorado  1973
3      37.148000  -104.571000     Colorado  1973
4      36.500000  -100.693000     Oklahoma  1974
…              …            …            …     …
13941  36.373500   -96.818700     Oklahoma  2016
13942  36.412200   -96.882400     Oklahoma  2016
13943  37.277167   -98.072667       Kansas  2016
13944  36.939300   -97.896000     Oklahoma  2016
13945  36.940500   -97.906300     Oklahoma  2016
and Wells:
             LAT        LONG     BBLS  Year
0      36.900324  -98.218260    300.0  1977
1      36.896636  -98.177720   1000.0  2002
2      36.806113  -98.325840   1000.0  1988
3      36.888589  -98.318530   1000.0  1985
4      36.892128  -98.194620   2400.0  2002
…              …           …        …     …
11117  36.263285  -99.557631   1000.0  2007
11118  36.263220  -99.548647   1000.0  2007
11119  36.520160  -99.334183  19999.0  2016
11120  36.276728  -99.298563  19999.0  2016
11121  36.436857  -99.137391  60000.0  2012
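How can I make a line graph showing the amount of BBLS per year (from Wells) and the number of earthquakes that occurred each year (from Earthquakes), where the x-axis shows the years since 1980, the y1 axis shows the sum of BBLS per year, and the y2 axis shows the number of earthquakes?
I believe I need a groupby with a count (for earthquakes) and a sum (for BBLS) to make the plot, but I have tried many approaches and I just can't work out how to do it.
The only thing that somewhat worked was the line graph for earthquakes: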
Earthquakes.pivot_table(index=['year'],columns='type',aggfunc='size').plot(kind='line')
Still, nothing works for the BBLS line graph. Not this:
Wells.pivot_table(index=['Year'],columns='BBLS',aggfunc='count').plot(kind='line')
nor this one:
plt.plot(Wells['Year'].values, Wells['BBL'].values, label='Barrels Produced')
plt.legend() # Plot legends (the two labels)
plt.xlabel('Year') # Set x-axis text
plt.ylabel('Earthquakes') # Set y-axis text
plt.show() # Display plot
nor this one, taken from elsewhere:
fig, ax = plt.subplots(figsize=(10,8))
Earthquakes.plot(ax = ax, marker='v')
ax.title.set_text('Earthquakes and Injection Wells')
ax.set_ylabel('Earthquakes')
ax.set_xlabel('Year')
ax.set_xticks(Earthquakes['year'])
ax2=ax.twinx()
ax2.plot(Wells.Year, Wells.BBL, color='c',
linewidth=2.0, label='Number of Barrels', marker='o')
ax2.set_ylabel('Annual Number of Barrels')
lines_1, labels_1 = ax.get_legend_handles_labels()
lines_2, labels_2 = ax2.get_legend_handles_labels()
lines = lines_1 + lines_2
labels = labels_1 + labels_2
ax.legend(lines, labels, loc='upper center')
Input data:
>>> df2 # Earthquakes
year
0 2007
1 1974
2 1979
3 1992
4 2006
.. ...
495 2002
496 2011
497 1971
498 1977
499 1985
[500 rows x 1 columns]
>>> df1 # Wells
BBLS year
0 16655 1997
1 7740 1998
2 37277 2000
3 20195 2014
4 11882 2018
.. ... ...
495 30832 1981
496 24770 2018
497 14949 1980
498 24743 1975
499 46933 2019
[500 rows x 2 columns]
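Prepare the data to plot:
# df2 holds the earthquakes: count the rows per year
data1 = df2.value_counts("year").sort_index().rename("Earthquakes")
# df1 holds the wells: sum BBLS per year
data2 = df1.groupby("year")["BBLS"].sum()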
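The question asks for years from 1980 onward, which the aggregation above does not enforce. A minimal sketch of one way to add that filter and line the two results up by year follows. The frames `quakes` and `wells` and all their values are made up for illustration; only the column names `year` and `BBLS` follow the input data shown above (the question's own Wells frame spells the column `Year`).

```python
import pandas as pd

# Hypothetical toy data; the values are invented for illustration only.
quakes = pd.DataFrame({"year": [1979, 1980, 1980, 1981, 1981, 1981]})
wells = pd.DataFrame({
    "year": [1979, 1980, 1980, 1982],
    "BBLS": [500.0, 1000.0, 2400.0, 300.0],
})

# Keep only rows from 1980 onward before aggregating.
quakes = quakes[quakes["year"] >= 1980]
wells = wells[wells["year"] >= 1980]

# Count earthquakes per year and sum barrels per year.
quake_counts = quakes.groupby("year").size().rename("Earthquakes")
barrel_sums = wells.groupby("year")["BBLS"].sum()

# Align both series on a shared year index; a year missing from one
# source shows up as NaN instead of being dropped.
yearly = pd.concat([quake_counts, barrel_sums], axis=1).sort_index()
print(yearly)
```

Putting both series in one frame makes gaps visible: a year with wells but no recorded earthquakes (or the reverse) appears as NaN, and it is a judgment call whether to fill those with 0 before plotting.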
Simple plot:
ax1 = data1.plot(legend=True, color="blue")
ax2 = data2.plot(legend=True, color="red", ax=ax1.twinx())
Now you can do whatever you like with the two axes.
A more controlled chart:
import matplotlib.pyplot as plt
# Figure and axes: ax2 shares the x-axis with ax1 but has its own y-axis
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
# Data (barrels are scaled to millions to match the axis label)
line1, = ax1.plot(data1.index, data1.values, label="Earthquakes", color="b")
line2, = ax2.plot(data2.index, data2.values / 10**6, label="Barrels", color="r")
# Legend: one combined legend for the lines of both axes
lines = [line1, line2]
ax1.legend(lines, [line.get_label() for line in lines])
# Titles
ax1.set_title("")
ax1.set_xlabel("Year")
ax1.set_ylabel("Earthquakes")
ax2.set_ylabel("Barrels Produced (MMbbl)")
plt.show()
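For reuse, the filtering, aggregation and twin-axis plotting can be wrapped in one function. This is only a sketch under stated assumptions: the function name `plot_counts_and_sums`, its `start_year` parameter and the toy data are hypothetical, and both frames are assumed to have a lowercase `year` column as in the input data above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display
import matplotlib.pyplot as plt
import pandas as pd


def plot_counts_and_sums(quakes, wells, start_year=1980):
    """Plot earthquakes per year (left axis) and summed BBLS per year (right axis)."""
    # Filter to the requested years, then aggregate per year.
    counts = quakes.loc[quakes["year"] >= start_year].groupby("year").size()
    sums = wells.loc[wells["year"] >= start_year].groupby("year")["BBLS"].sum()

    fig, ax1 = plt.subplots()
    ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
    line1, = ax1.plot(counts.index, counts.values, color="b", label="Earthquakes")
    line2, = ax2.plot(sums.index, sums.values, color="r", label="BBLS")

    ax1.set_xlabel("Year")
    ax1.set_ylabel("Earthquakes")
    ax2.set_ylabel("BBLS (sum per year)")
    ax1.legend([line1, line2], ["Earthquakes", "BBLS"])
    return fig, ax1, ax2


# Hypothetical toy data; the values are invented for illustration only.
quakes = pd.DataFrame({"year": [1979, 1980, 1980, 1981, 1982, 1982, 1982]})
wells = pd.DataFrame({
    "year": [1979, 1980, 1981, 1981, 1982],
    "BBLS": [500.0, 1000.0, 2400.0, 600.0, 300.0],
})
fig, ax1, ax2 = plot_counts_and_sums(quakes, wells)
```

For interactive use, drop the `matplotlib.use("Agg")` line and call `plt.show()`, or write the figure to disk with `fig.savefig(...)`.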