Pandas: Calculate the time between columns for when a condition is satisfied

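I want to calculate the number of days since the last breakdown occurred. My table has a date column (Day) in datetime format and a Number of breakdowns column: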

print (df)
          Day  Number of breakdowns
0  2017-01-09                   1.0
1  2017-01-12                   0.0
2  2017-01-13                   0.0
3  2017-01-14                   0.0
4  2017-01-16                   3.0
5  2017-01-17                   0.0
6  2017-01-18                   0.0
7  2017-01-19                   1.0
8  2017-01-20                   0.0
9  2017-01-21                   0.0
10 2017-01-23                   1.0

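First compare Number of breakdowns for inequality with 0 using ne, then take the cumulative sum with cumsum so that every breakdown row starts a new group. transform('first') then returns the first Day of each group for every row in that group, so it is possible to subtract it from Day and convert the resulting timedeltas to days with dt.days: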

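import pandas as pd

df['Day'] = pd.to_datetime(df['Day'])

# each nonzero breakdown count starts a new group; take that group's first Day
s = df.groupby(df['Number of breakdowns'].ne(0).cumsum())['Day'].transform('first')
# days elapsed since the most recent breakdown
df['New'] = (df['Day'] - s).dt.days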
print (df)
          Day  Number of breakdowns  New
0  2017-01-09                   1.0    0
1  2017-01-12                   0.0    3
2  2017-01-13                   0.0    4
3  2017-01-14                   0.0    5
4  2017-01-16                   3.0    0
5  2017-01-17                   0.0    1
6  2017-01-18                   0.0    2
7  2017-01-19                   1.0    0
8  2017-01-20                   0.0    1
9  2017-01-21                   0.0    2
10 2017-01-23                   1.0    0
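
To see why this works, it may help to split the one-liner into its intermediate steps. The following is only a sketch that rebuilds the sample data from the question; the names is_breakdown, group_id and last_breakdown are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Day": ["2017-01-09", "2017-01-12", "2017-01-13", "2017-01-14",
            "2017-01-16", "2017-01-17", "2017-01-18", "2017-01-19",
            "2017-01-20", "2017-01-21", "2017-01-23"],
    "Number of breakdowns": [1.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0],
})
df["Day"] = pd.to_datetime(df["Day"])

# True on rows where at least one breakdown happened
is_breakdown = df["Number of breakdowns"].ne(0)

# Running count of breakdown rows: each breakdown opens a new group label
group_id = is_breakdown.cumsum()
print(group_id.tolist())   # [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4]

# Day of the first row in each group, repeated for every row of that group
last_breakdown = df.groupby(group_id)["Day"].transform("first")

# Days elapsed since the most recent breakdown
df["New"] = (df["Day"] - last_breakdown).dt.days
print(df["New"].tolist())  # [0, 3, 4, 5, 0, 1, 2, 0, 1, 2, 0]
```

Note that if the data started with rows that have 0 breakdowns, those rows would fall into group 0 and be counted from the first row of the table rather than from an actual breakdown, so they may need separate handling.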