Removing timestamp, day and month from Pandas dataframe for a time series plot

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

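I have two dataframes, and I want to plot a time series for each of them. I would like the two time series stacked on top of each other, and I am struggling to get this working with Pandas dataframes. I also have a question about the x-ticks.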

The dates and times are given as strings and are already in chronological order. Here is a sample of the data (taken from a larger dataset):

df1 = pd.DataFrame([["2004-03-01 00:00", 2.3],
              ["2004-03-05 00:00", 2.4],
              ["2004-03-25 00:00", 2.25],
              ["2004-07-01 00:00", 2.7],
              ["2005-01-01 00:00", 2.9],
              ["2005-02-17 00:00", 3.1],
              ["2005-12-01 00:00", 3.5],
              ["2006-02-01 00:00", 3.3],
              ["2006-04-05 00:00", 3.08],
              ["2006-08-22 00:00", 2.4],
              ["2007-07-01 00:00", 2.1]], columns = ['Date and Time', 'Values 1'])


df2 = pd.DataFrame([["2004-03-01 00:00", 12.3],
              ["2004-03-05 00:00", 14.5],
              ["2004-03-25 00:00", 12.1],
              ["2004-07-01 00:00", 10.0],
              ["2005-01-01 00:00", 12.1],
              ["2005-02-17 00:00", 9.3],
              ["2005-12-01 00:00", 8.1],
              ["2006-02-01 00:00", 6.5],
              ["2006-04-05 00:00", 7.5],
              ["2006-08-22 00:00", 6.4],
              ["2007-07-01 00:00", 4.1]], columns = ['Date and Time', 'Values 2'])

First, if I try to plot df1,

df1.plot(x='Date and Time', y='Values 1',  legend=False)
plt.xlabel('Year')
plt.ylabel('Values 1')
plt.show()

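the output is the plot I want, but the x-ticks are formatted as year-month-day-time. In this example I only want the years 2004, 2005, 2006 and 2007 to be shown as ticks and, more importantly, I want them to be scaled correctly (so the 2005 tick should sit close to the "2005-01-01" data point). Is this possible?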

In addition, I want to plot these graphs stacked on top of each other. I tried the code below, to no avail.

plt.figure(1)

plt.subplot(211)
df1.plot(x='Date and Time', y='Values 1', legend=False)
plt.xlabel('Year')
plt.ylabel('Values 1')

plt.subplot(212)
df2.plot(x='Date and Time', y='Values 2', legend=False)
plt.xlabel('Year')
plt.ylabel('Values 2')

plt.show()

Let's use set_major_formatter and set_major_locator on the xaxis, together with plt.subplots:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.dates as mdates

df1 = pd.DataFrame([["2004-03-01 00:00", 2.3],
              ["2004-03-05 00:00", 2.4],
              ["2004-03-25 00:00", 2.25],
              ["2004-07-01 00:00", 2.7],
              ["2005-01-01 00:00", 2.9],
              ["2005-02-17 00:00", 3.1],
              ["2005-12-01 00:00", 3.5],
              ["2006-02-01 00:00", 3.3],
              ["2006-04-05 00:00", 3.08],
              ["2006-08-22 00:00", 2.4],
              ["2007-07-01 00:00", 2.1]], columns = ['Date and Time', 'Values 1'])


df2 = pd.DataFrame([["2004-03-01 00:00", 12.3],
              ["2004-03-05 00:00", 14.5],
              ["2004-03-25 00:00", 12.1],
              ["2004-07-01 00:00", 10.0],
              ["2005-01-01 00:00", 12.1],
              ["2005-02-17 00:00", 9.3],
              ["2005-12-01 00:00", 8.1],
              ["2006-02-01 00:00", 6.5],
              ["2006-04-05 00:00", 7.5],
              ["2006-08-22 00:00", 6.4],
              ["2007-07-01 00:00", 4.1]], columns = ['Date and Time', 'Values 2'])

yearFmt = mdates.DateFormatter('%Y')
years = mdates.YearLocator()  


df1['Date and Time'] = pd.to_datetime(df1['Date and Time'])
fig, ax = plt.subplots(2,1, figsize=(10,10))
df1.plot(x='Date and Time', y='Values 1', legend=False, ax=ax[0])
ax[0].set_xlabel('Year')
ax[0].set_ylabel('Values 1')
ax[0].xaxis.set_major_formatter(yearFmt)
ax[0].xaxis.set_major_locator(years)


df2['Date and Time'] = pd.to_datetime(df2['Date and Time'])
df2.plot(x='Date and Time', y='Values 2', legend=False, ax=ax[1])
ax[1].set_xlabel('Year')
ax[1].set_ylabel('Values 2')
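# Use fresh ticker instances here: matplotlib tickers should not be shared between axes
ax[1].xaxis.set_major_formatter(mdates.DateFormatter('%Y'))
ax[1].xaxis.set_major_locator(mdates.YearLocator())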

plt.tight_layout()
plt.show()

Output: (plot not shown)
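
The key step in the code above is pd.to_datetime: while the column holds strings, the points are not spaced by elapsed time, so yearly ticks cannot be placed correctly. Since both panels repeat the same steps, they could be factored into a small helper. This is only a minimal sketch, not part of the answer above: plot_yearly is a hypothetical name, it calls ax.plot directly instead of df.plot, and it assumes sharex=True is acceptable because both frames cover the same dates.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd


def plot_yearly(ax, df, x_col, y_col):
    """Draw y_col against x_col on ax, with one major tick per calendar year."""
    # Parse the strings into datetimes so points are spaced by elapsed time
    ax.plot(pd.to_datetime(df[x_col]), df[y_col])
    ax.set_ylabel(y_col)
    # Create new ticker instances on every call instead of reusing one object
    ax.xaxis.set_major_locator(mdates.YearLocator())
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))


# A few rows taken from the sample data in the question
dates = ["2004-03-01 00:00", "2005-01-01 00:00",
         "2006-02-01 00:00", "2007-07-01 00:00"]
df1 = pd.DataFrame({'Date and Time': dates, 'Values 1': [2.3, 2.9, 3.3, 2.1]})
df2 = pd.DataFrame({'Date and Time': dates, 'Values 2': [12.3, 12.1, 6.5, 4.1]})

# sharex=True keeps both stacked panels on the same time range
fig, axes = plt.subplots(2, 1, sharex=True, figsize=(10, 10))
plot_yearly(axes[0], df1, 'Date and Time', 'Values 1')
plot_yearly(axes[1], df2, 'Date and Time', 'Values 2')
axes[1].set_xlabel('Year')
fig.tight_layout()
```

Call plt.show() afterwards (without the Agg line) to display the figure, or fig.savefig(...) to write it to a file.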

First, you should move to the object-oriented API in matplotlib. That is what I will use in the rest of this answer.

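# Convert the string column to real datetimes so the x-axis is scaled by time
df1['Date and Time'] = pd.to_datetime(df1['Date and Time'])
date_and_time = df1['Date and Time']
fig, ax = plt.subplots()
df1.plot(x='Date and Time', y='Values 1', legend=False, ax=ax)
# Place one tick at the first observation of each year, labelled with its date
plot_ticks = date_and_time.groupby(date_and_time.dt.year).first()
ax.set_xticks(plot_ticks.values)
ax.set_xticklabels(plot_ticks.dt.date.values)
plt.show()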
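
The labels in the snippet above are the full dates of each year's first observation. If only the year should appear at those positions, the group keys can serve as labels instead. The following is a rough sketch under my own assumptions: first_per_year is a hypothetical name, the data is a shortened copy of the question's sample, and the ticks sit at the first data point of each year rather than at January 1.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame([["2004-03-01 00:00", 2.3],
                   ["2004-07-01 00:00", 2.7],
                   ["2005-01-01 00:00", 2.9],
                   ["2005-12-01 00:00", 3.5],
                   ["2006-02-01 00:00", 3.3]],
                  columns=['Date and Time', 'Values 1'])
df['Date and Time'] = pd.to_datetime(df['Date and Time'])
date_and_time = df['Date and Time']

# First timestamp observed in each calendar year; the index holds the year
first_per_year = date_and_time.groupby(date_and_time.dt.year).first()

fig, ax = plt.subplots()
ax.plot(date_and_time, df['Values 1'])
# Convert the timestamps to matplotlib date numbers before fixing the ticks
ax.set_xticks(mdates.date2num(first_per_year.values))
ax.set_xticklabels([str(year) for year in first_per_year.index])
ax.set_xlabel('Year')
ax.set_ylabel('Values 1')
```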