How to calculate the difference of a value for multiple dates?

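My dataset contains multiple gas storage values. I want to compare each of them with the value on the exact same date one year earlier, across several years. This is what my data looks like: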

facility gasDayStartedOn gasInStorage full injection
UGS Haidach 2022-01-09 4.3041 37 0.00
UGS Haidach 2022-01-08 4.3263 38 0.00
UGS Haidach 2021-01-09 5.5678 43 0.00

How can I calculate/compare gasInStorage for the same facility on the same gasDayStartedOn in each year, and store the result in a new column of the same DataFrame?

I wrote this code:

def det_dates(df, a_date):
    b_df = df[df.gasDayStartedOn == a_date - pd.Timedelta(days=365)]
    if b_df.shape[0] != 0:
        return b_df.full.values[0]
    return None

def get_dif(df):
    for i, r in df.iterrows():
        a_date = r.gasDayStartedOn
        a_gasInStorage = r.gasInStorage
        b_gasInStorage = det_dates(df, a_date)
        
        if b_gasInStorage:
            dif_gasInStorage = a_gasInStorage - gasInStorage
        else:
            dif_gasInStorage = None
            
        df.loc[i, 'difdif'] = dif_gasInStorage

dfs = []

for com_fac, group in tqdm(data_1.groupby(['company', 'facility'])):
    g = group.copy()
    g.sort_values('gasDayStartedOn', inplace=True, ascending=False)
    get_dif(g)
    dfs.append(g)

But it doesn't work! Please help! This is the error I get:

from datetime import datetime, timedelta

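You would get better answers if you provided the expected output. That said, a simple way to check the difference between one year and the next on the same day is to use groupby together with diff.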

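import pandas as pd
df = pd.read_clipboard()

# Parse the dates and sort chronologically so that diff() subtracts
# the earlier year's value from the later year's value.
df['gasDayStartedOn'] = pd.to_datetime(df.gasDayStartedOn)
df = df.sort_values(by='gasDayStartedOn', ascending=True)

# Rows sharing the same day, month, and facility form one group,
# so each group holds one row per year.
group = df.groupby([df.gasDayStartedOn.dt.day, df.gasDayStartedOn.dt.month, 'facility'])

# Change in gasInStorage relative to the previous row in the same group.
df['diff'] = group['gasInStorage'].diff()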

df
Out[1]: 
      facility gasDayStartedOn  gasInStorage  full  injection    diff
2  UGS Haidach      2021-01-09        5.5678    43        0.0     NaN
1  UGS Haidach      2022-01-08        4.3263    38        0.0     NaN
0  UGS Haidach      2022-01-09        4.3041    37        0.0 -1.2637
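
If a year is missing for some day, diff() compares against whatever earlier year happens to be in the group, not necessarily the one exactly a year before. An alternative that only matches the date exactly one calendar year earlier is a self-merge. The following is a minimal sketch, not the method used above: it assumes the three sample rows from the question built inline, groups by facility only (the question also groups by company, which is not in the sample), and the column name gasInStorage_prev_year is made up for illustration. It uses pd.DateOffset(years=1) rather than Timedelta(days=365) so that leap years do not shift the date by a day. Note that DateOffset maps Feb 29 to Feb 28 of the following year, which can produce duplicate matches for that date.

```python
import pandas as pd

# Sample rows copied from the question (only the columns needed here).
df = pd.DataFrame({
    "facility": ["UGS Haidach"] * 3,
    "gasDayStartedOn": pd.to_datetime(["2022-01-09", "2022-01-08", "2021-01-09"]),
    "gasInStorage": [4.3041, 4.3263, 5.5678],
})

# Shift a copy of the data forward by one calendar year so that each
# shifted row carries the date it should be compared against.
prev = df[["facility", "gasDayStartedOn", "gasInStorage"]].copy()
prev["gasDayStartedOn"] = prev["gasDayStartedOn"] + pd.DateOffset(years=1)
prev = prev.rename(columns={"gasInStorage": "gasInStorage_prev_year"})

# Left merge keeps every original row; rows without a match get NaN.
out = df.merge(prev, on=["facility", "gasDayStartedOn"], how="left")
out["diff"] = out["gasInStorage"] - out["gasInStorage_prev_year"]
print(out)
```

With the sample data, only the 2022-01-09 row finds a match (2021-01-09), so it is the only row with a non-NaN diff, the same as in the groupby output above.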