Daily and hourly power consumption computed from cumulative readings do not match
I am computing hourly and daily figures for a power meter that records cumulative energy consumption, as follows:
Device | Time | kWH |
---|---|---|
Meter 1 | 12 May 2022 21:05:00 | 900 |
Meter 1 | 12 May 2022 21:20:00 | 930 |
Meter 1 | 12 May 2022 21:55:00 | 950 |
Meter 1 | 12 May 2022 22:05:00 | 1000 |
Meter 1 | 12 May 2022 22:55:00 | 1050 |
Meter 1 | 13 May 2022 00:05:00 | 1200 |
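For anyone who wants to reproduce this, the sample can be loaded into a DataFrame roughly as follows. This is a minimal sketch: the column names `Device`, `Time` and `kWH` follow the table, and the explicit `format` string is an assumption about how the timestamps are written.

```python
import pandas as pd

# Sample readings copied from the table; kWH is a cumulative counter.
df = pd.DataFrame({
    "Device": ["Meter 1"] * 6,
    "Time": [
        "12 May 2022 21:05:00",
        "12 May 2022 21:20:00",
        "12 May 2022 21:55:00",
        "12 May 2022 22:05:00",
        "12 May 2022 22:55:00",
        "13 May 2022 00:05:00",
    ],
    "kWH": [900, 930, 950, 1000, 1050, 1200],
})

# Parse the timestamps so they can be grouped by hour or by day.
df["Time"] = pd.to_datetime(df["Time"], format="%d %b %Y %H:%M:%S")
```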
I tried grouping by hour and by date, but the resulting data does not make sense:
Hourly report:
Meter 1|12 May 2022 21:00:00 |50 (950-900)
Meter 1|12 May 2022 22:00:00 |50 (1050-1000)
Meter 1|13 May 2022 00:00:00 |0 (only 1 reading)
Daily report:
Meter 1|12 May 2022 |150 (1050-900)
Meter 1|13 May 2022 |0 (only 1 reading)
-> For 12 May 2022 the hourly values (50 + 50 = 100) do not add up to the daily value (150).
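The numbers above can be reproduced with a sketch like the following, which assumes that "grouping" here means taking the last minus the first reading inside each hour or day. The helper name `naive_report` and the label `kWH used` are made up for illustration; `'h'` is the hourly alias in recent pandas, and older releases spell it `'H'`.

```python
import pandas as pd

df = pd.DataFrame({
    "Device": ["Meter 1"] * 6,
    "Time": pd.to_datetime([
        "2022-05-12 21:05:00", "2022-05-12 21:20:00", "2022-05-12 21:55:00",
        "2022-05-12 22:05:00", "2022-05-12 22:55:00", "2022-05-13 00:05:00",
    ]),
    "kWH": [900, 930, 950, 1000, 1050, 1200],
})

def naive_report(frame, freq):
    """Last minus first reading inside each (Device, period) group."""
    grouped = frame.groupby(["Device", pd.Grouper(key="Time", freq=freq)])["kWH"]
    return (grouped.last() - grouped.first()).rename("kWH used")

naive_hourly = naive_report(df, "h")  # hourly buckets
naive_daily = naive_report(df, "D")   # calendar-day buckets
print(naive_hourly)
print(naive_daily)
```

The consumption between the last reading of one period and the first reading of the next (for example 950 to 1000 across 22:00) is never counted, which is where the hourly report loses 50 kWh relative to the daily one.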
So I want to find a way to compute the expected data below:
Hourly report:
Meter 1|12 May 2022 21:00:00 |50 (950-900)
Meter 1|12 May 2022 22:00:00 |100 (1050-950)
Meter 1|13 May 2022 00:00:00 |150 (1200-1050)
Daily report:
Meter 1|12 May 2022 |150 (1050-900)
Meter 1|13 May 2022 |150 (1200-1050)
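I would like to resolve the mismatch by measuring each new hour against the last reading of the previous hour, and each new day against the last reading of the previous day.
I am currently using Python and pandas.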
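The idea is to aggregate per Device and per hour/day using Grouper with GroupBy.first and GroupBy.last, take the difference of the last values per Device, and replace the first (missing) value with last minus first: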
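import pandas as pd

# Parse the timestamps so they can be grouped by period.
df['Time'] = pd.to_datetime(df['Time'])
# First and last reading per device and hour ('h' replaces the alias 'H', deprecated since pandas 2.2).
df1 = df.groupby(['Device', pd.Grouper(freq='h', key='Time')])['kWH'].agg(['first','last'])
# Difference between consecutive "last" readings per device; the first hour has
# no predecessor, so it falls back to last - first within that hour.
df1 = df1.groupby(level=0)['last'].diff().fillna(df1['last'].sub(df1['first'])).reset_index(name='hour diff')
print(df1)
    Device                Time  hour diff
0  Meter 1 2022-05-12 21:00:00       50.0
1  Meter 1 2022-05-12 22:00:00      100.0
2  Meter 1 2022-05-13 00:00:00      150.0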
# Same computation per device and calendar day.
df2 = df.groupby(['Device', pd.Grouper(freq='D', key='Time')])['kWH'].agg(['first','last'])
df2 = df2.groupby(level=0)['last'].diff().fillna(df2['last'].sub(df2['first'])).reset_index(name='day diff')
print(df2)
    Device       Time  day diff
0  Meter 1 2022-05-12     150.0
1  Meter 1 2022-05-13     150.0
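An alternative sketch, not part of the answer above: take the difference between consecutive readings per device first, then sum those increments per hour or per day. Assuming the readings are sorted and the counter never resets, the sums telescope to the same values as the first/last approach, and the hourly figures add up to the daily ones by construction. The names `delta`, `usage` and `kWH used` are illustrative, and treating the very first reading's increment as 0 is an assumption.

```python
import pandas as pd

df = pd.DataFrame({
    "Device": ["Meter 1"] * 6,
    "Time": pd.to_datetime([
        "2022-05-12 21:05:00", "2022-05-12 21:20:00", "2022-05-12 21:55:00",
        "2022-05-12 22:05:00", "2022-05-12 22:55:00", "2022-05-13 00:05:00",
    ]),
    "kWH": [900, 930, 950, 1000, 1050, 1200],
})

# Increments only make sense on time-ordered readings of the same device.
df = df.sort_values(["Device", "Time"])
# Consumption since the previous reading; the first reading of each device
# has no predecessor, so its increment is set to 0 (an assumption).
df["delta"] = df.groupby("Device")["kWH"].diff().fillna(0)

def usage(frame, freq):
    """Sum the per-reading increments inside each (Device, period) group."""
    return (
        frame.groupby(["Device", pd.Grouper(key="Time", freq=freq)])["delta"]
        .sum()
        .reset_index(name="kWH used")
    )

hourly = usage(df, "h")  # 'h' is the hourly alias; older pandas uses 'H'
daily = usage(df, "D")
print(hourly)
print(daily)
```

On the sample data this gives 50, 100 and 150 per hour and 150 and 150 per day. Each increment is attributed to the period of the later reading, so the 950 to 1000 step that crosses 22:00 is counted in the 22:00 hour, as in the expected report.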