Pandas: Rolling mean over array of windows
I can compute multiple rolling means along these lines:
d1 = df.set_index('DateTime').sort_index()
ma_1h = d1.groupby('Event').rolling('H').mean()
ma_2h = d1.groupby('Event').rolling('2H').mean()
But if I want to do this for a whole list of windows, how can I do it efficiently?
window_array = ['H','3H','6H','9H'] # etc
I would also like the rolling means to be included back in my original DataFrame.
I believe you need to convert the offsets and create the new DataFrames in a loop with a list comprehension, then concat them at the end:
from pandas.tseries.frequencies import to_offset
df1 = pd.concat([d1.groupby('Event').rolling(to_offset(x)).mean() for x in window_array],
                axis=1,
                keys=window_array)
Sample:
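import pandas as pd
from pandas.tseries.frequencies import to_offset

# 10 timestamps, 38 minutes apart, split across three Event groups
rng = pd.date_range('2017-04-03', periods=10, freq='38T')
df = pd.DataFrame({'DateTime': rng, 'a': range(10), 'Event':[4] * 3 + [3] * 3 + [1] * 4})
print (df)

window_array = ['H','3H','6H','9H']
# time-based rolling windows need a sorted DatetimeIndex
d1 = df.set_index('DateTime').sort_index()
# one rolling mean per window, placed side by side with the window as column label
a = pd.concat([d1.groupby('Event')['a'].rolling(to_offset(x)).mean() for x in window_array],
              axis=1,
              keys=window_array)
print (a)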
                             H   3H   6H   9H
Event DateTime
1     2017-04-03 03:48:00  6.0  6.0  6.0  6.0
      2017-04-03 04:26:00  6.5  6.5  6.5  6.5
      2017-04-03 05:04:00  7.5  7.0  7.0  7.0
      2017-04-03 05:42:00  8.5  7.5  7.5  7.5
3     2017-04-03 01:54:00  3.0  3.0  3.0  3.0
      2017-04-03 02:32:00  3.5  3.5  3.5  3.5
      2017-04-03 03:10:00  4.5  4.0  4.0  4.0
4     2017-04-03 00:00:00  0.0  0.0  0.0  0.0
      2017-04-03 00:38:00  0.5  0.5  0.5  0.5
      2017-04-03 01:16:00  1.5  1.0  1.0  1.0
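The question also asks for the rolling means to end up back in the original DataFrame. A minimal sketch of one way that might be done, assuming the same sample data, is to join the concatenated result back on the (Event, DateTime) pair. The `windows` dict and the column labels `ma_1h` and `ma_3h` are illustrative names, not part of the original answer, and `pd.Timedelta` objects stand in for the alias strings so the sketch does not depend on which frequency aliases a given pandas version accepts.

```python
import pandas as pd

# Same sample data as in the answer: 10 rows, 38 minutes apart.
rng = pd.date_range('2017-04-03', periods=10, freq=pd.Timedelta(minutes=38))
df = pd.DataFrame({'DateTime': rng, 'a': range(10),
                   'Event': [4] * 3 + [3] * 3 + [1] * 4})
d1 = df.set_index('DateTime').sort_index()

# Hypothetical mapping of column label -> window length (names are illustrative).
windows = {'ma_1h': pd.Timedelta(hours=1), 'ma_3h': pd.Timedelta(hours=3)}

# One rolling mean per window, side by side, indexed by (Event, DateTime).
ma = pd.concat(
    [d1.groupby('Event')['a'].rolling(w).mean() for w in windows.values()],
    axis=1,
    keys=list(windows),
)

# Move Event into d1's index so the join matches on both levels,
# then restore the original DateTime-indexed layout.
result = (
    d1.set_index('Event', append=True)
      .swaplevel()              # index becomes (Event, DateTime), like ma
      .join(ma)
      .reset_index('Event')
      .sort_index()
)
print(result)
```

Joining on both Event and DateTime, rather than on DateTime alone, should keep the alignment correct even if two events happen to share a timestamp.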
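To write the rolling means back into d1 as columns, a loop along these lines can work:
window_array = ['H','3H','6H','9H'] # etc
for window in window_array:
    # transform keeps d1's original index, so each result aligns row by row
    d1[window] = d1.groupby('Event')['a'].transform(lambda s: s.rolling(window).mean())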