Pandas: calculate the time difference between columns when a condition is satisfied
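I have a pandas df with 2 columns: Day (dates in datetime format) and Number of breakdowns.
I want to create two new columns. The first is the number of days from the previous breakdown event (which I have already done), and the second is the number of days to the next breakdown event (which I am struggling with).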
Day         Number of breakdowns  Days from Previous Breakdown Event
2017-01-09  0.0                   0
2017-01-12  0.0                   0
2017-01-13  0.0                   0
2017-01-14  0.0                   0
2017-01-16  1.0                   0
2017-01-17  0.0                   1
2017-01-18  0.0                   2
2017-01-19  1.0                   0
2017-01-20  0.0                   1
2017-01-21  0.0                   2
2017-01-23  1.0                   0
Days from Previous Breakdown Event counts the days elapsed since the most recent breakdown.
Code:
s = df.groupby(df['Number of breakdowns'].ne(0).cumsum())['Day'].transform('first')
df['Days from Previous Breakdown Event'] = (df['Day'] - s).dt.days
zeros_index = df['Number of breakdowns'].ne(0).idxmax()
df.loc[:zeros_index,'Days from Previous Breakdown Event'] = 0
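For anyone who wants to run the snippet above, here is a minimal self-contained sketch. The DataFrame construction is an assumption rebuilt from the sample table (the original post only shows the printed rows); the calculation itself is the same code as above, with comments added.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question (assumed construction).
df = pd.DataFrame({
    'Day': pd.to_datetime([
        '2017-01-09', '2017-01-12', '2017-01-13', '2017-01-14',
        '2017-01-16', '2017-01-17', '2017-01-18', '2017-01-19',
        '2017-01-20', '2017-01-21', '2017-01-23',
    ]),
    'Number of breakdowns': [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0],
})

# Every breakdown row starts a new group; 'first' broadcasts that group's
# first date (the breakdown date) to all rows of the group.
s = df.groupby(df['Number of breakdowns'].ne(0).cumsum())['Day'].transform('first')
df['Days from Previous Breakdown Event'] = (df['Day'] - s).dt.days

# Rows up to and including the first breakdown have no previous event,
# so they are reset to 0.
zeros_index = df['Number of breakdowns'].ne(0).idxmax()
df.loc[:zeros_index, 'Days from Previous Breakdown Event'] = 0

print(df['Days from Previous Breakdown Event'].tolist())
# [0, 0, 0, 0, 0, 1, 2, 0, 1, 2, 0]
```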
I need help with the Days from Next Breakdown Event column. I would like it to look like this:
Day         Number of breakdowns  Days from Next Breakdown Event
2017-01-09  0.0                   7
2017-01-12  0.0                   4
2017-01-13  0.0                   3
2017-01-14  0.0                   2
2017-01-16  1.0                   0
2017-01-17  0.0                   2
2017-01-18  0.0                   1
2017-01-19  1.0                   0
2017-01-20  0.0                   3
2017-01-21  0.0                   2
2017-01-23  1.0                   0
Reverse the row order with iloc[::-1] before the cumsum, use transform with last instead of first, and swap the subtraction to s - df['Day']:
s = df.groupby(df['Number of breakdowns'].iloc[::-1].ne(0).cumsum())['Day'].transform('last')
df['Days from Next Breakdown Event'] = (s - df['Day']).dt.days
print (df)
Day Number of breakdowns Days from Previous Breakdown Event \
0 2017-01-09 0.0 0
1 2017-01-12 0.0 0
2 2017-01-13 0.0 0
3 2017-01-14 0.0 0
4 2017-01-16 1.0 0
5 2017-01-17 0.0 1
6 2017-01-18 0.0 2
7 2017-01-19 1.0 0
8 2017-01-20 0.0 1
9 2017-01-21 0.0 2
10 2017-01-23 1.0 0
Days from Next Breakdown Event
0 7
1 4
2 3
3 2
4 0
5 2
6 1
7 0
8 3
9 2
10 0
Details:
print (s)
0 2017-01-16
1 2017-01-16
2 2017-01-16
3 2017-01-16
4 2017-01-16
5 2017-01-19
6 2017-01-19
7 2017-01-19
8 2017-01-23
9 2017-01-23
10 2017-01-23
Name: Day, dtype: datetime64[ns]
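To see why this works, it may help to look at the group key itself. The sketch below is self-contained; the DataFrame construction is assumed from the sample table, and the names key, days_to_next, next_day and alt are illustrative only. It also shows a possible variant based on where and bfill, which is not part of the answer above but gives the same result on this data.

```python
import pandas as pd

# Sample frame rebuilt from the table in the question (assumed construction).
df = pd.DataFrame({
    'Day': pd.to_datetime([
        '2017-01-09', '2017-01-12', '2017-01-13', '2017-01-14',
        '2017-01-16', '2017-01-17', '2017-01-18', '2017-01-19',
        '2017-01-20', '2017-01-21', '2017-01-23',
    ]),
    'Number of breakdowns': [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0],
})

# Walking from the bottom up, count the breakdowns seen so far.
# Each breakdown row therefore closes its group instead of opening it.
key = df['Number of breakdowns'].iloc[::-1].ne(0).cumsum()

# key keeps the original index labels, so groupby aligns it with df.
print(key.sort_index().tolist())
# [3, 3, 3, 3, 3, 2, 2, 2, 1, 1, 1]

# The last date in each group is the next breakdown date.
s = df.groupby(key)['Day'].transform('last')
days_to_next = (s - df['Day']).dt.days
print(days_to_next.tolist())
# [7, 4, 3, 2, 0, 2, 1, 0, 3, 2, 0]

# Variant (not from the answer): keep only breakdown dates, back-fill upward.
next_day = df['Day'].where(df['Number of breakdowns'].ne(0)).bfill()
alt = (next_day - df['Day']).dt.days
print(alt.tolist())
# [7, 4, 3, 2, 0, 2, 1, 0, 3, 2, 0]
```

One difference to keep in mind: if the data ended with rows after the final breakdown, the groupby version would measure them against the last row's date, while the bfill variant would leave them as missing values. The sample data ends on a breakdown, so both agree here.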