How to compute avg of an index for x days before a date (if the day is not a holiday) and merge it to dataframe?

Question

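I have a dataset that contains the traffic index of a location on a given date. For a given date, I want to calculate the average of all traffic indices over the 30 days before that date, counting only the days in that 30-day subset that are not holidays.

I want to do this calculation in Python. The screenshot below visually represents my requirement.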

Explanation of the screenshot

On April 1, 2019: 
I want to calculate the 30 Day Non-Holiday traffic Index Average,
for a given location and map it to a new column with a similar column name.

The column weekend_holiday is a boolean column that is true (1) for days that are public holidays or weekends.
Such entries must be ignored when computing the location's average traffic index.

Link to the sample dataset: https://gist.github.com/skwolvie/f01c027de0816c28337870286ee61a9d

Please suggest Python pandas techniques to achieve this result.

Answer 1

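You can compute a rolling average with pandas' rolling, which accepts either a fixed number of rows or a time-based length (such as '30D') as its window.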
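
To illustrate the difference between the two window types, here is a minimal sketch on a made-up series. The data and the variable names (s, by_rows, by_time) are hypothetical and not taken from the question's dataset.

```python
import pandas as pd

# Hypothetical toy series with a gap in the dates (January 3 is missing)
s = pd.Series(
    [1.0, 2.0, 4.0, 8.0],
    index=pd.to_datetime(["2019-01-01", "2019-01-02", "2019-01-04", "2019-01-05"]),
)

# Fixed-size window: always the last 2 rows, regardless of their dates
by_rows = s.rolling(2).mean()

# Time-based window: every row within the last 2 calendar days,
# including the current row (requires a datetime-like index)
by_time = s.rolling("2D").mean()

print(pd.DataFrame({"value": s, "by_rows": by_rows, "by_time": by_time}))
```

Because of the missing date, the row-count window still averages two rows on January 4, while the 2-day window only sees January 4 itself. For "the previous 30 days" on data that may have gaps, the time-based form is usually the one that matches the intent.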

The following code computes this average for every row of the DataFrame:

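import pandas as pd

# Assumes df is already loaded with the columns Date, weekend_or_holiday, 'New Delhi' and 'Mumbai'
# Set Date as a sorted DatetimeIndex, which time-based rolling requires
df['Date'] = pd.to_datetime(df['Date'])
df = df.set_index('Date').sort_index()

# Mask weekend/holiday rows as NaN (the rolling mean skips NaN),
# then average over the trailing 30-day window.
# Note: '30D' includes the current day; pass closed='left' to rolling() to exclude it.
workdays = df[['New Delhi', 'Mumbai']].where(df['weekend_or_holiday'] == 0)
rolled = workdays.rolling('30D').mean()
df['DELHI'] = rolled['New Delhi']
df['MUMBAI'] = rolled['Mumbai']

# Restore Date as a regular column
df = df.reset_index()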
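
If the average should cover only the days strictly before the given date, one option is the closed='left' argument of rolling. The following is a minimal sketch on made-up data: the values and the DELHI_30D column name are hypothetical, and only the weekend_or_holiday and 'New Delhi' column names are borrowed from the code above.

```python
import pandas as pd

# Hypothetical toy data: Fri 2019-03-29 through Mon 2019-04-01
df = pd.DataFrame({
    "Date": pd.date_range("2019-03-29", periods=4, freq="D"),
    "weekend_or_holiday": [0, 1, 1, 0],
    "New Delhi": [10.0, 50.0, 70.0, 20.0],
}).set_index("Date")

# Hide weekend/holiday values so the rolling mean skips them
workdays = df["New Delhi"].where(df["weekend_or_holiday"] == 0)

# closed="left" makes the window [t - 30D, t), i.e. strictly before the current date
df["DELHI_30D"] = workdays.rolling("30D", closed="left").mean()

print(df.reset_index())
```

In this sketch the value for April 1 is computed from March 29 alone, since the weekend rows are masked and April 1 itself falls outside the window. The first row has no earlier data, so its result is NaN.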

python

numpy

data-manipulation

dataframe

pandas