Daily and hourly power consumption computed from cumulative readings do not match

I am computing hourly and daily data for a power meter that records cumulative energy consumption, as follows:

Device Time kWH
Meter 1 12 May 2022 21:05:00 900
Meter 1 12 May 2022 21:20:00 930
Meter 1 12 May 2022 21:55:00 950
Meter 1 12 May 2022 22:05:00 1000
Meter 1 12 May 2022 22:55:00 1050
Meter 1 13 May 2022 00:05:00 1200

I tried grouping by hour and by date, but the resulting data does not make sense, as shown below:

Hourly report:
Meter 1|12 May 2022 21:00:00    |50 (950-900)
Meter 1|12 May 2022 22:00:00    |50 (1050-1000)
Meter 1|13 May 2022 00:00:00    |0 (only 1 data)

Daily report:
Meter 1|12 May 2022         |150 (1050-900)
Meter 1|13 May 2022         |0 (only 1 data)

-> The hourly values for 12 May 2022 add up to 100 (50 + 50), but the daily value is 150, so the two reports do not match.
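The mismatch can be reproduced with a minimal sketch, assuming the grouping was done as "last reading minus first reading" within each bucket. The question does not show the original code, so the helper `naive_usage` and the way the DataFrame is built are hypothetical. The frequency alias `'h'` is used because `'H'` is deprecated in pandas 2.2 and later.

```python
import pandas as pd

# Sample readings copied from the table above (cumulative kWH).
df = pd.DataFrame({
    "Device": ["Meter 1"] * 6,
    "Time": pd.to_datetime([
        "2022-05-12 21:05:00",
        "2022-05-12 21:20:00",
        "2022-05-12 21:55:00",
        "2022-05-12 22:05:00",
        "2022-05-12 22:55:00",
        "2022-05-13 00:05:00",
    ]),
    "kWH": [900, 930, 950, 1000, 1050, 1200],
})

def naive_usage(frame, freq):
    """Hypothetical helper: last minus first reading inside each time bucket."""
    grouped = frame.groupby(["Device", pd.Grouper(key="Time", freq=freq)])["kWH"]
    return grouped.last() - grouped.first()

hourly_naive = naive_usage(df, "h")
daily_naive = naive_usage(df, "D")

print(hourly_naive)
print(daily_naive)
```

Under this assumption, whatever was consumed between the last reading of one bucket and the first reading of the next (for example 950 to 1000 across 22:00) is counted in no hourly bucket, which is why the hourly values add up to less than the daily value.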

So I want to find a way to compute the expected data below:

Hourly report:
Meter 1|12 May 2022 21:00:00    |50 (950-900)
Meter 1|12 May 2022 22:00:00    |100 (1050-950)
Meter 1|13 May 2022 00:00:00    |150 (1200-1050)

Daily report:
Meter 1|12 May 2022         |150 (1050-900)
Meter 1|13 May 2022         |150 (1200-1050)

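I would like to resolve the mismatch by computing each hour's value as the new reading minus the last reading of the previous hour, and each day's value as the new reading minus the last reading of the previous day.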

I am currently using Python and pandas.

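The idea is to aggregate per Device and per hour/day using Grouper with GroupBy.first and GroupBy.last, take the difference of the last values per Device, and fill the first value (which has no previous bucket) with last minus first: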

import pandas as pd

# df holds the Device, Time and kWH columns shown in the question
df['Time'] = pd.to_datetime(df['Time'])

df1 = df.groupby(['Device', pd.Grouper(freq='H', key='Time')])['kWH'].agg(['first','last'])

df1 = df1.groupby(level=0)['last'].diff().fillna(df1['last'].sub(df1['first'])).reset_index(name='hour diff')
print (df1)
    Device                Time  hour diff
0  Meter 1 2022-05-12 21:00:00       50.0
1  Meter 1 2022-05-12 22:00:00      100.0
2  Meter 1 2022-05-13 00:00:00      150.0

df2 = df.groupby(['Device', pd.Grouper(freq='D', key='Time')])['kWH'].agg(['first','last'])

df2 = df2.groupby(level=0)['last'].diff().fillna(df2['last'].sub(df2['first'])).reset_index(name='day diff')
print (df2)
    Device       Time   day diff
0  Meter 1 2022-05-12      150.0
1  Meter 1 2022-05-13      150.0
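As a cross-check, here is a minimal sketch of an alternative technique, not the approach above: first take the difference between consecutive readings per Device, then sum those differences per hour and per day. It assumes the sample data from the question and that the consumption between two readings is attributed to the bucket of the later reading, which matches the expected output. The column name `delta` is an assumption introduced for illustration, and `'h'` replaces `'H'`, which is deprecated in pandas 2.2 and later.

```python
import pandas as pd

# Sample readings copied from the question (cumulative kWH).
df = pd.DataFrame({
    "Device": ["Meter 1"] * 6,
    "Time": pd.to_datetime([
        "2022-05-12 21:05:00",
        "2022-05-12 21:20:00",
        "2022-05-12 21:55:00",
        "2022-05-12 22:05:00",
        "2022-05-12 22:55:00",
        "2022-05-13 00:05:00",
    ]),
    "kWH": [900, 930, 950, 1000, 1050, 1200],
}).sort_values(["Device", "Time"])

# Consumption since the previous reading of the same device.
# The very first reading has no predecessor, so its delta is set to 0.
df["delta"] = df.groupby("Device")["kWH"].diff().fillna(0)

hourly = (
    df.groupby(["Device", pd.Grouper(key="Time", freq="h")])["delta"]
      .sum()
      .reset_index(name="hour diff")
)
daily = (
    df.groupby(["Device", pd.Grouper(key="Time", freq="D")])["delta"]
      .sum()
      .reset_index(name="day diff")
)

# Rolling the hourly report up to days should reproduce the daily report.
hourly_rolled_up = (
    hourly.groupby(["Device", pd.Grouper(key="Time", freq="D")])["hour diff"]
          .sum()
)

print(hourly)
print(daily)
```

Because every reading-to-reading difference lands in exactly one hourly bucket and one daily bucket, the hourly values add up to the daily values by construction.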