Python shift() from same column like in Excel with dates

I want to create the 'target_start' column in Python:

id start end diff target_start
12220 1999-11-22 2008-08-31 3515 1999-11-22
12220 2018-04-16 2019-09-15 1 2018-04-16
12220 2019-09-16 2019-11-30 1 2018-04-16
12220 2019-12-01 2020-03-31 1 2018-04-16
12220 2020-04-01 2020-06-30 -711 2018-04-16
11132 2018-07-20 2019-09-15 1 2018-07-20
11132 2019-09-16 2021-01-01 -44197 2018-07-20

This is easy to solve in Excel:

But I don't know how to do this in Python. Filling the first target row is simple:

df.loc[df.index==0,'target_start']= df['start']

I tried this code, but it did not work:

import pandas as pd
df=pd.read_excel('./Shift.xlsx')

#if id != id.shift(1) then target_start = start
df.loc[df['id'] != df['id'].shift(1), 'target_start'] = df['start']

#elif: diff != 1 then target_start = start
df.loc[df['diff'].shift(1) != 1, 'target_start'] = df['start']

#else: target_start = target_start.shift(1)
df.loc[(df.index != 0) & (df['id'] == df['id'].shift(1)) & (df['diff'].shift(1) == 1), 'target_start']=df['target_start'].shift(1)

print(df)

The result is:

id start end diff target_start
12220 1999-11-22 2008-08-31 3515 1999-11-22
12220 2018-04-16 2019-09-15 1 2018-04-16
12220 2019-09-16 2019-11-30 1 2018-04-16
12220 2019-12-01 2020-03-31 1 NaT
12220 2020-04-01 2020-06-30 -711 NaT
11132 2018-07-20 2019-09-15 1 2018-07-20
11132 2019-09-16 2021-01-01 -44197 2018-07-20

Does anyone know how to solve this? Thanks in advance!
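A likely reason for the NaT rows in the attempt above: `shift(1)` is evaluated once, on the column as it exists at that moment, so a value can only move down by a single row. The sketch below reproduces this on the sample data typed in by hand (the `end` column is omitted, and `starts_run`, `seed` and `one_pass` are names made up for this illustration).

```python
import pandas as pd

# Sample rows from the question, typed in by hand (end column omitted).
df = pd.DataFrame({
    "id": [12220, 12220, 12220, 12220, 12220, 11132, 11132],
    "start": pd.to_datetime([
        "1999-11-22", "2018-04-16", "2019-09-16", "2019-12-01",
        "2020-04-01", "2018-07-20", "2019-09-16",
    ]),
    "diff": [3515, 1, 1, 1, -711, 1, -44197],
})

# Rows covered by the first two assignments in the attempt:
# a new id, or a previous diff that is not 1.
starts_run = (df["id"] != df["id"].shift(1)) | (df["diff"].shift(1) != 1)

# target_start after those two assignments: start on those rows, NaT elsewhere.
seed = df["start"].where(starts_run)

# The third assignment: every remaining row takes the value one row above.
# seed.shift(1) is computed once, so only the row directly below a filled
# row receives a date; rows further down still see NaT.
one_pass = seed.where(starts_run, seed.shift(1))
print(one_pass)
```

Rows 3 and 4 stay NaT because the row above them was itself still empty when the shift was computed. Filling them needs a row-by-row pass or a forward fill.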

Here is how I would implement your Excel formula (the one you highlighted):

df.start = pd.to_datetime(df.start)
df.end = pd.to_datetime(df.end)
df.target_start = pd.to_datetime(df.target_start)

df["id_shift"] = df.id.shift()

target_start = [df.iloc[0, 1]]

for i in range(1, df.shape[0]):
    print(i)
    if df.iloc[i, 0] != df.iloc[i - 1, 0]:
        target_start.append(df.iloc[i, 1])
    else:
        if df.iloc[i, 3] == 1:
            target_start.append(df.iloc[i, 1])
        else:
            target_start.append(target_start[i - 1])


df["target_start"] = target_start
del df["id_shift"]

It produces the following result:

   id     start       end         diff    target_start
0  12220  1999-11-22  2008-08-31  3515    1999-11-22
1  12220  2018-04-16  2019-09-15  1       2018-04-16
2  12220  2019-09-16  2019-11-30  1       2019-09-16
3  12220  2019-12-01  2020-03-31  1       2019-12-01
4  12220  2020-04-01  2020-06-30  -711    2019-12-01
5  11132  2018-07-20  2019-09-15  1       2018-07-20
6  11132  2019-09-16  2021-01-01  -44197  2018-07-20

Thanks @quest! That's great :)

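I fixed one issue: the check has to look at the previous row's diff and begin a new start date when it is not 1: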

        else:
            if df.iloc[i-1, 3] != 1:
                target_start.append(df.iloc[i, 1])

So the complete working code is:

# df and pd come from the read_excel code in the question.
df["start"] = pd.to_datetime(df["start"])
df["end"] = pd.to_datetime(df["end"])

# The first row always opens a new run, so it keeps its own start.
target_start = [df.iloc[0, 1]]

for i in range(1, df.shape[0]):
    if df.iloc[i, 0] != df.iloc[i - 1, 0]:
        # New id: the run begins at this row's start.
        target_start.append(df.iloc[i, 1])
    elif df.iloc[i - 1, 3] != 1:
        # Same id, but the previous row's diff is not 1: begin a new run.
        target_start.append(df.iloc[i, 1])
    else:
        # Same id and previous diff == 1: carry the run's start forward.
        target_start.append(target_start[i - 1])

df["target_start"] = target_start
df.head(7)
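
The same rule can probably be written without an explicit loop. This is a minimal sketch, not the code from the answer: it assumes the rows of each id are contiguous and already in order, as in the sample, and `add_target_start` is a hypothetical helper name. It marks the rows that begin a new run, keeps `start` only on those rows, and forward-fills the rest.

```python
import pandas as pd


def add_target_start(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a target_start column (hypothetical helper)."""
    out = df.copy()
    # A run begins on the first row of an id, or when the previous row's
    # diff is not 1. On the very first row shift(1) yields NaN, and both
    # comparisons are then True, so it always begins a run.
    starts_run = (out["id"] != out["id"].shift(1)) | (out["diff"].shift(1) != 1)
    # Keep start on run-opening rows only, then carry it down the run.
    # The first row of every id opens a run, so ffill never crosses ids.
    out["target_start"] = out["start"].where(starts_run).ffill()
    return out


# Sample rows from the question, typed in by hand (end column omitted).
df = pd.DataFrame({
    "id": [12220, 12220, 12220, 12220, 12220, 11132, 11132],
    "start": pd.to_datetime([
        "1999-11-22", "2018-04-16", "2019-09-16", "2019-12-01",
        "2020-04-01", "2018-07-20", "2019-09-16",
    ]),
    "diff": [3515, 1, 1, 1, -711, 1, -44197],
})

result = add_target_start(df)
print(result)
```

On the sample data this should give the same target_start values as the loop. For large frames it avoids the per-row `iloc` lookups.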

Thanks again! You helped a lot.