Pandas: Calculate the time between columns for when a condition is satisfied
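I want to calculate the number of days since the last breakdown occurred.
My table has a date column (Day) in datetime format and a Number of breakdowns column.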
print(df)
           Day  Number of breakdowns
0   2017-01-09                   1.0
1   2017-01-12                   0.0
2   2017-01-13                   0.0
3   2017-01-14                   0.0
4   2017-01-16                   3.0
5   2017-01-17                   0.0
6   2017-01-18                   0.0
7   2017-01-19                   1.0
8   2017-01-20                   0.0
9   2017-01-21                   0.0
10  2017-01-23                   1.0
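First compare Number of breakdowns with 0 using ne (not equal), then take the cumulative sum with cumsum. Each non-zero row increments the sum, so every breakdown and the rows that follow it share one group label. Use transform('first') to get the first Day value per group, which is the date of the most recent breakdown, so you can subtract it from Day and convert the timedeltas to days: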
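df['Day'] = pd.to_datetime(df['Day'])
# Each non-zero breakdown count starts a new group; take that group's first Day
s = df.groupby(df['Number of breakdowns'].ne(0).cumsum())['Day'].transform('first')
# Days elapsed since the most recent breakdown
df['New'] = (df['Day'] - s).dt.days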
print(df)
           Day  Number of breakdowns  New
0   2017-01-09                   1.0    0
1   2017-01-12                   0.0    3
2   2017-01-13                   0.0    4
3   2017-01-14                   0.0    5
4   2017-01-16                   3.0    0
5   2017-01-17                   0.0    1
6   2017-01-18                   0.0    2
7   2017-01-19                   1.0    0
8   2017-01-20                   0.0    1
9   2017-01-21                   0.0    2
10  2017-01-23                   1.0    0
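For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and splits the one-liner into named steps. The names is_breakdown, group_id, last_breakdown and the extra New_masked column are illustrative and not part of the original answer. The masking step is an assumption about how you might want to treat rows that come before the first recorded breakdown. It changes nothing for this sample, since the data starts with a breakdown.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Day": ["2017-01-09", "2017-01-12", "2017-01-13", "2017-01-14",
            "2017-01-16", "2017-01-17", "2017-01-18", "2017-01-19",
            "2017-01-20", "2017-01-21", "2017-01-23"],
    "Number of breakdowns": [1.0, 0.0, 0.0, 0.0, 3.0, 0.0,
                             0.0, 1.0, 0.0, 0.0, 1.0],
})
df["Day"] = pd.to_datetime(df["Day"])

# True on rows where at least one breakdown happened
is_breakdown = df["Number of breakdowns"].ne(0)

# Running count of breakdown rows: each breakdown row and the
# zero rows after it share the same label
group_id = is_breakdown.cumsum()

# First Day within each group, i.e. the date of the latest breakdown
last_breakdown = df.groupby(group_id)["Day"].transform("first")

# Whole days elapsed since that breakdown
df["New"] = (df["Day"] - last_breakdown).dt.days

# Assumption (not in the original answer): rows before the first
# breakdown have group_id 0 and no meaningful reference date,
# so mask them as missing instead of counting from the first row
df["New_masked"] = df["New"].where(group_id.gt(0))

print(df)
```

The New column should match the output shown above. Splitting out group_id also makes it easy to inspect how the rows are grouped if the result looks wrong on your own data.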