Heatmap with pandas DatetimeIndex on both axes
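I want to make a heatmap from a pandas DataFrame (or Series) with a DatetimeIndex, so that hours are on the x axis and days are on the y axis, with both sets of tick labels displayed in DatetimeIndex style.
If I do the following: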
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame(np.random.randint(10, size=4*24*200))
df.index = pd.date_range(start='2019-02-01 11:30:00', periods=200*24*4, freq='15min')
df['minute'] = df.index.hour*60 + df.index.minute
df['dayofyear'] = df.index.month + df.index.dayofyear
df = df.pivot(index='dayofyear', columns='minute', values=df.columns[0])
sns.heatmap(df)
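the tick labels clearly lose their datetime format.
What I want is something like the plot I produced with a complicated, non-generalizable function, which apparently does not even work properly.
Does anyone know a neat way to create this kind of heatmap in Python?
Edit:
The function I created: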
import matplotlib.pyplot as plt

def plot_heatmap(df_in, plot_column=0, figsize=(20,12), vmin=None, vmax=None, cmap='jet', xlabel='hour (UTC)', ylabel='day', rotation=0, freq='5s'):
    '''
    Plots heatmap with date labels
    df_in: pandas DataFrame or pandas Series
    plot_column: column to plot if DataFrame has multiple columns
    ...
    '''
    # convert to DataFrame in case a Series is passed:
    try:
        df_in = df_in.to_frame()
    except AttributeError:
        pass
    # make a copy so the input is not overwritten (in case input is an object attribute)
    df = df_in.copy()
    # pad missing dates:
    idx = pd.date_range(df_in.index[0], df_in.index[-1], freq=freq)
    df = df.reindex(idx, fill_value=np.nan)
    df['hour'] = df.index.hour*3600 + df.index.minute*60 + df.index.second
    df['dayofyear'] = df.index.month + df.index.dayofyear
    # Create mesh for heatmap plotting:
    pivot = df.pivot(index='dayofyear', columns='hour', values=df.columns[plot_column])
    # plot
    plt.figure(figsize=figsize)
    sns.heatmap(pivot, cmap=cmap)
    # set xticks
    plt.xticks(np.linspace(0,pivot.shape[1],25), labels=range(25))
    plt.xlabel(xlabel)
    # set yticks
    ylabels = []
    ypositions = []
    day0 = df['dayofyear'].unique().min()
    for day in df['dayofyear'].unique():
        day_delta = day-day0
        # create pandas Timestamp
        temp_tick = df.index[0] + pd.Timedelta('%sD' %day_delta)
        # check whether the tick shall be shown or not
        if temp_tick.day==1 or temp_tick.day==15:
            temp_tick_nice = '%s-%s-%s' %(temp_tick.year, temp_tick.month, temp_tick.day)
            ylabels.append(temp_tick_nice)
            ypositions.append(day_delta)
    plt.yticks(ticks=ypositions, labels=ylabels, rotation=0)
    plt.ylabel(ylabel)
The date format disappears because you do this:
df['dayofyear'] = df.index.month + df.index.dayofyear
Here both operands are integer-valued, so df['dayofyear'] is integer-typed as well.
Instead, do:
df['dayofyear'] = df.index.date
With that change, the heatmap's y axis is labeled with dates.
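The difference between the two keys can be checked without plotting. The following is a minimal sketch, assuming a shortened three-day version of the question's data; the column names `value`, `day_int` and `day_date` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Three days of 15-minute samples, starting at the same timestamp as the question.
rng = np.random.default_rng(0)
idx = pd.date_range("2019-02-01 11:30:00", periods=4 * 24 * 3, freq="15min")
df = pd.DataFrame({"value": rng.integers(10, size=len(idx))}, index=idx)

df["minute"] = df.index.hour * 60 + df.index.minute
# Key used in the question: month number plus day of year, a plain integer.
df["day_int"] = df.index.month + df.index.dayofyear
# Key suggested in the answer: datetime.date objects.
df["day_date"] = df.index.date

pivot_int = df.pivot(index="day_int", columns="minute", values="value")
pivot_date = df.pivot(index="day_date", columns="minute", values="value")

print(pivot_int.index.tolist())                    # integers with no date meaning
print([str(d) for d in pivot_date.index])          # calendar dates
```

The row labels of the pivoted frame are what seaborn uses as y tick labels, so the second pivot is the one that keeps readable dates.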
For a DatetimeIndex with a sampling interval shorter than 1 minute, the best solution I have found so far is the following:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
freq = '30s'
df = pd.DataFrame(np.random.randint(10, size=4*24*200*20))
df.index = pd.date_range(start='2019-02-01 11:30:00', periods=200*24*4*20, freq=freq)
df['hour'] = df.index.strftime('%H:%M:%S')
df['dayofyear'] = df.index.date
df = df.pivot(index='dayofyear', columns='hour', values=df.columns[0])
df.columns = pd.DatetimeIndex(df.columns).strftime('%H:%M')
df.index = pd.DatetimeIndex(df.index).strftime('%m/%Y')
xticks_spacing = int(pd.Timedelta('2h')/pd.Timedelta(freq))
ax = sns.heatmap(df, xticklabels=xticks_spacing, yticklabels=30)
plt.yticks(rotation=0)
In the resulting plot, the only flaw is that the month tick positions are not well defined or precise...
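One possible way around the imprecise month ticks is to compute the tick positions from the dates themselves instead of using a fixed spacing of 30 rows. This is only a sketch: `month_start_ticks` is a hypothetical helper, not part of pandas or seaborn, and it assumes the pivoted frame still has date-like row labels (that is, it is called before the index is converted to strings).

```python
import numpy as np
import pandas as pd


def month_start_ticks(dates, fmt="%Y-%m"):
    """Return (positions, labels) for rows that fall on the first day of a month.

    `dates` is the row index of the pivoted frame (anything pd.to_datetime accepts).
    Positions are cell centres (row number + 0.5), which is where seaborn's
    heatmap places its ticks.
    """
    dt = pd.DatetimeIndex(pd.to_datetime(dates))
    rows = np.flatnonzero(dt.day == 1)
    return rows + 0.5, dt[rows].strftime(fmt).tolist()


# Same 200-day span as in the examples above, one row per day.
days = pd.date_range("2019-02-01", periods=200, freq="D").date
positions, labels = month_start_ticks(days)
print(list(zip(positions.tolist(), labels)))
```

To apply it, one could draw the heatmap with `yticklabels=False` and then call `ax.set_yticks(positions)` followed by `ax.set_yticklabels(labels, rotation=0)` on the returned axes, so each label sits exactly on the row of the first day of its month.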