当您有重复的日期时滚动 1 天
Roll over 1 day when you have repeated dates
我有这组数据:
df = pd.DataFrame()
df['Date'] = ["29/07/2021", "29/07/2021", "29/07/2021", "29/07/2021", "30/07/2021", "30/07/2021", "30/07/2021", "30/07/2021", "31/07/2021", "31/07/2021", "01/08/2021", "01/08/2021", "02/08/2021"]
df['Time'] = ["06:48:00", "06:53:00", "06:56:00", "06:59:00", "07:14:00", "07:18:00", "07:40:00", "08:12:00", "08:42:00", "08:57:00", "05:45:00", "05:55:00", "06:05:00"]
df["Column1"] = [0.011534891, 0.013458399, 0.017792937, 0.018807581, 0.025931434, 0.025163517, 0.026561283, 0.027743659, 0.028854, 0.000383506, 0.000543031, 0.000342, 0.000313769]
我想滚动 1 天并找到“Column1”的平均值。
我试过这段代码:
df.index = df["Date"]
df['mean'] = df["Column1"].rolling(1440, min_periods=1).mean()
print(df['mean'])
output:
Date
29/07/2021 0.011535
29/07/2021 0.012497
29/07/2021 0.014262
29/07/2021 0.015398
30/07/2021 0.017505
30/07/2021 0.018781
30/07/2021 0.019893
30/07/2021 0.020874
31/07/2021 0.021761
31/07/2021 0.019623
01/08/2021 0.017889
01/08/2021 0.016426
02/08/2021 0.015187
如您所见,日期重复,这是不正确的。
我希望有这样的东西(我想要这些值的平均值,我只是用 mean 告诉你我需要什么,但它只是一个值作为平均一个):
Date
29/07/2021 mean(0.011535, 0.012497, 0.014262, 0.015398)
30/07/2021 mean(0.017505, 0.018781, 0.019893, 0.020874)
31/07/2021 mean(0.021761, 0.019623)
01/08/2021 mean(0.017889, 0.016426)
02/08/2021 0.015187
我也试过这段代码:
df['DateTime'] = pd.to_datetime(df['Date'], dayfirst=True) + pd.to_timedelta(df['Time'])
df.index = df['DateTime']
df['mean'] = df["Column1"].rolling('1D').mean()
print(df['mean'])
output :
DateTime
2021-07-29 06:48:00 0.011535
2021-07-29 06:53:00 0.012497
2021-07-29 06:56:00 0.014262
2021-07-29 06:59:00 0.015398
2021-07-30 07:14:00 0.025931
2021-07-30 07:18:00 0.025547
2021-07-30 07:40:00 0.025885
2021-07-30 08:12:00 0.026350
2021-07-31 08:42:00 0.028854
2021-07-31 08:57:00 0.014619
2021-08-01 05:45:00 0.009927
2021-08-01 05:55:00 0.007531
2021-08-02 06:05:00 0.000314
但我得到了不同的结果,仍然不是我想要的结果。
而不是 rolling
,所需的输出要求 groupby
+ mean
:
out = df.groupby('Date', as_index=False)['Column1'].mean()
输出:
Date Column1
0 01/08/2021 0.000443
1 02/08/2021 0.000314
2 29/07/2021 0.015398
3 30/07/2021 0.026350
4 31/07/2021 0.014619
我有这组数据:
df = pd.DataFrame()
df['Date'] = ["29/07/2021", "29/07/2021", "29/07/2021", "29/07/2021", "30/07/2021", "30/07/2021", "30/07/2021", "30/07/2021", "31/07/2021", "31/07/2021", "01/08/2021", "01/08/2021", "02/08/2021"]
df['Time'] = ["06:48:00", "06:53:00", "06:56:00", "06:59:00", "07:14:00", "07:18:00", "07:40:00", "08:12:00", "08:42:00", "08:57:00", "05:45:00", "05:55:00", "06:05:00"]
df["Column1"] = [0.011534891, 0.013458399, 0.017792937, 0.018807581, 0.025931434, 0.025163517, 0.026561283, 0.027743659, 0.028854, 0.000383506, 0.000543031, 0.000342, 0.000313769]
我想滚动 1 天并找到“Column1”的平均值。
我试过这段代码:
df.index = df["Date"]
df['mean'] = df["Column1"].rolling(1440, min_periods=1).mean()
print(df['mean'])
output:
Date
29/07/2021 0.011535
29/07/2021 0.012497
29/07/2021 0.014262
29/07/2021 0.015398
30/07/2021 0.017505
30/07/2021 0.018781
30/07/2021 0.019893
30/07/2021 0.020874
31/07/2021 0.021761
31/07/2021 0.019623
01/08/2021 0.017889
01/08/2021 0.016426
02/08/2021 0.015187
如您所见,日期重复,这是不正确的。
我希望有这样的东西(我想要这些值的平均值,我只是用 mean 告诉你我需要什么,但它只是一个值作为平均一个):
Date
29/07/2021 mean(0.011535, 0.012497, 0.014262, 0.015398)
30/07/2021 mean(0.017505, 0.018781, 0.019893, 0.020874)
31/07/2021 mean(0.021761, 0.019623)
01/08/2021 mean(0.017889, 0.016426)
02/08/2021 0.015187
我也试过这段代码:
df['DateTime'] = pd.to_datetime(df['Date'], dayfirst=True) + pd.to_timedelta(df['Time'])
df.index = df['DateTime']
df['mean'] = df["Column1"].rolling('1D').mean()
print(df['mean'])
output :
DateTime
2021-07-29 06:48:00 0.011535
2021-07-29 06:53:00 0.012497
2021-07-29 06:56:00 0.014262
2021-07-29 06:59:00 0.015398
2021-07-30 07:14:00 0.025931
2021-07-30 07:18:00 0.025547
2021-07-30 07:40:00 0.025885
2021-07-30 08:12:00 0.026350
2021-07-31 08:42:00 0.028854
2021-07-31 08:57:00 0.014619
2021-08-01 05:45:00 0.009927
2021-08-01 05:55:00 0.007531
2021-08-02 06:05:00 0.000314
但我得到了不同的结果,仍然不是我想要的结果。
而不是 rolling
,所需的输出要求 groupby
+ mean
:
out = df.groupby('Date', as_index=False)['Column1'].mean()
输出:
Date Column1
0 01/08/2021 0.000443
1 02/08/2021 0.000314
2 29/07/2021 0.015398
3 30/07/2021 0.026350
4 31/07/2021 0.014619