How to find the difference between two datetimes in pandas?
I have the following data:
id=["Train A","Train A","Train A","Train B","Train B","Train B"]
arrival_time = ["0"," 2016-05-19 13:50:00","2016-05-19 21:25:00","0","2016-05-24 18:30:00","2016-05-26 12:15:00"]
departure_time = ["2016-05-19 08:25:00","2016-05-19 16:00:00","2016-05-20 07:45:00","2016-05-24 12:50:00","2016-05-25 23:00:00","2016-05-26 19:45:00"]
which gives the following table:
id arrival_time departure_time
Train A 0 2016-05-19 08:25:00
Train A 2016-05-19 13:50:00 2016-05-19 16:00:00
Train A 2016-05-19 21:25:00 2016-05-20 07:45:00
Train B 0 2016-05-24 12:50:00
Train B 2016-05-24 18:30:00 2016-05-25 23:00:00
Train B 2016-05-26 12:15:00 2016-05-26 19:45:00
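The data type of the departure and arrival times is datetime64[ns].
How do I calculate the time difference between the departure time in the first row and the arrival time in the second row, for example between [2016-05-19 08:25:00] and [2016-05-19 13:50:00]? I tried the following code, but it did not work: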
df['Duration'] = df.departure_time.iloc[i+1] - df.arrival_time.iloc[i]
I think you first need to convert the date strings with to_datetime. The 0 values also have to be converted to NaN:
import numpy as np
import pandas as pd
df = pd.DataFrame({'id': id, 'arrival_time': arrival_time, 'departure_time': departure_time})
# replace the '0' placeholder with NaN so that it parses to NaT
df['arrival_time'] = pd.to_datetime(df['arrival_time'].replace('0', np.nan))
# alternative: coerce every value that is not a date to NaT
#df['arrival_time'] = pd.to_datetime(df['arrival_time'], errors='coerce')
df['departure_time'] = pd.to_datetime(df['departure_time'])
print (df)
arrival_time departure_time id
0 NaT 2016-05-19 08:25:00 Train A
1 2016-05-19 13:50:00 2016-05-19 16:00:00 Train A
2 2016-05-19 21:25:00 2016-05-20 07:45:00 Train A
3 NaT 2016-05-24 12:50:00 Train B
4 2016-05-24 18:30:00 2016-05-25 23:00:00 Train B
5 2016-05-26 12:15:00 2016-05-26 19:45:00 Train B
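As a small sketch of the conversion step above, the snippet below parses a shortened copy of the arrival_time list and checks which entries end up as NaT. The explicit format string and the .str.strip() call are my own additions, not part of the answer: I am assuming every valid timestamp follows the YYYY-MM-DD HH:MM:SS layout, and stripping guards against the leading space in one of the sample strings.

```python
import pandas as pd

# Shortened copy of the question's arrival_time list (note the leading space).
arrival_time = ["0", " 2016-05-19 13:50:00", "2016-05-19 21:25:00"]

# Assumption: all real timestamps use this exact layout.
cleaned = pd.Series(arrival_time).str.strip()
arrival = pd.to_datetime(cleaned, format="%Y-%m-%d %H:%M:%S", errors="coerce")

# "0" does not match the format, so errors="coerce" turns it into NaT.
print(arrival.isna().tolist())
print(arrival.dtype)
```

With an explicit format, anything that does not match becomes NaT, so it is worth counting the NaT values afterwards to make sure only the expected placeholders were dropped.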
Then shift the departure_time column within each id group using groupby, and subtract the arrival_time column:
df['Duration'] = df.groupby('id')['departure_time'].shift() - df['arrival_time']
print (df)
arrival_time departure_time id Duration
0 NaT 2016-05-19 08:25:00 Train A NaT
1 2016-05-19 13:50:00 2016-05-19 16:00:00 Train A -1 days +18:35:00
2 2016-05-19 21:25:00 2016-05-20 07:45:00 Train A -1 days +18:35:00
3 NaT 2016-05-24 12:50:00 Train B NaT
4 2016-05-24 18:30:00 2016-05-25 23:00:00 Train B -1 days +18:20:00
5 2016-05-26 12:15:00 2016-05-26 19:45:00 Train B -1 days +10:45:00
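To see why the shift is done per group, here is a minimal sketch on a reduced version of the data (two rows per train). The column names prev_plain and prev_grouped are hypothetical and only used for the comparison.

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["Train A", "Train A", "Train B", "Train B"],
    "departure_time": pd.to_datetime([
        "2016-05-19 08:25:00", "2016-05-19 16:00:00",
        "2016-05-24 12:50:00", "2016-05-25 23:00:00",
    ]),
})

# A plain shift moves every value down one row, so the last departure of
# Train A ends up in the first row of Train B.
df["prev_plain"] = df["departure_time"].shift()

# A grouped shift restarts for each id, so the first row of every train is NaT.
df["prev_grouped"] = df.groupby("id")["departure_time"].shift()

print(df)
```

In the full example the first arrival_time of each train is NaT anyway, so both variants would give NaT there, but the grouped shift stays correct if those rows ever carry a real arrival time.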
Or, if you need positive timedeltas, swap the operands:
df['Duration'] = df['arrival_time'] - df.groupby('id')['departure_time'].shift()
print (df)
arrival_time departure_time id Duration
0 NaT 2016-05-19 08:25:00 Train A NaT
1 2016-05-19 13:50:00 2016-05-19 16:00:00 Train A 05:25:00
2 2016-05-19 21:25:00 2016-05-20 07:45:00 Train A 05:25:00
3 NaT 2016-05-24 12:50:00 Train B NaT
4 2016-05-24 18:30:00 2016-05-25 23:00:00 Train B 05:40:00
5 2016-05-26 12:15:00 2016-05-26 19:45:00 Train B 13:15:00
Finally, you can convert the timedelta to seconds with total_seconds:
df['Duration'] = (df['arrival_time'] - df.groupby('id')['departure_time'].shift()).dt.total_seconds()
print (df)
arrival_time departure_time id Duration
0 NaT 2016-05-19 08:25:00 Train A NaN
1 2016-05-19 13:50:00 2016-05-19 16:00:00 Train A 19500.0
2 2016-05-19 21:25:00 2016-05-20 07:45:00 Train A 19500.0
3 NaT 2016-05-24 12:50:00 Train B NaN
4 2016-05-24 18:30:00 2016-05-25 23:00:00 Train B 20400.0
5 2016-05-26 12:15:00 2016-05-26 19:45:00 Train B 47700.0
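If seconds are not the unit you want, one option (a sketch, not part of the answer above) is to divide the timedelta column by a pd.Timedelta of the target unit. The durations below are the ones from the output above, typed in by hand.

```python
import pandas as pd

# Durations taken from the example output, plus one missing value.
duration = pd.Series(pd.to_timedelta(["05:25:00", "05:40:00", "13:15:00", pd.NaT]))

seconds = duration.dt.total_seconds()          # float seconds, NaN for NaT
minutes = duration / pd.Timedelta(minutes=1)   # float minutes
hours = duration / pd.Timedelta(hours=1)       # float hours

print(pd.DataFrame({"seconds": seconds, "minutes": minutes, "hours": hours}))
```

Dividing by a Timedelta returns plain floats, and missing durations come through as NaN.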