Find change in time in a pandas data frame
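I extracted the following lists from a pandas data frame. They hold the start date and time and the end date and time of each row, and I want to find the difference between them.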
start_date = ['29.12.2020', '29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']
start_time = [datetime.time(11, 10), datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15)]
end_date = ['29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']
end_time = [datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15), datetime.time(23, 15)]
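So I want to combine each date with its time to get one start timestamp and one end timestamp, and then find the difference between the two in HH:MM format.
For example, for the first row the difference between the two points in time should be 12:05 (12 hours 5 minutes).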
import datetime
import numpy as np
import pandas as pd

start_date = ['29.12.2020', '29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']
start_time = [datetime.time(11, 10), datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15)]
end_date = ['29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']
end_time = [datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15), datetime.time(23, 15)]
df = pd.DataFrame(data={'start_date': start_date,
                        'start_time': start_time,
                        'end_date': end_date,
                        'end_time': end_time})
# Join date and time as strings and parse them with an explicit day-first format
fmt = '%d.%m.%Y %H:%M:%S'
df['start_date_time'] = pd.to_datetime(df['start_date'] + ' ' + df['start_time'].astype(str), format=fmt)
df['end_date_time'] = pd.to_datetime(df['end_date'] + ' ' + df['end_time'].astype(str), format=fmt)
df['diff'] = df['end_date_time'] - df['start_date_time']
# Difference in (fractional) hours
df['hours'] = df['diff'] / np.timedelta64(1, 'h')
# Whole hours, plus the leftover minutes, each zero-padded to two digits
whole_hours = df['hours'].astype(int)
minutes = (df['diff'] / np.timedelta64(1, 'm') - whole_hours * 60).astype(int)
df['HH:MM'] = whole_hours.astype(str).str.zfill(2) + ':' + minutes.astype(str).str.zfill(2)
print(df[['start_date_time', 'end_date_time', 'HH:MM']])
Output:
      start_date_time       end_date_time  HH:MM
0 2020-12-29 11:10:00 2020-12-29 23:15:00  12:05
1 2020-12-29 23:15:00 2020-12-30 05:15:00  06:00
2 2020-12-30 05:15:00 2020-12-30 11:15:00  06:00
3 2020-12-30 11:15:00 2020-12-30 17:15:00  06:00
4 2020-12-30 17:15:00 2020-12-30 23:15:00  06:00
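As a possible variation, the HH:MM string can be built from whole minutes with integer arithmetic, which zero-pads both parts. This is only a sketch: the helper name `to_hhmm` is made up for illustration, and it assumes the differences are never negative.

```python
import pandas as pd

def to_hhmm(diff: pd.Series) -> pd.Series:
    """Format a timedelta Series as zero-padded HH:MM strings (hypothetical helper)."""
    # Whole minutes in each difference; assumes non-negative timedeltas
    total_minutes = diff.dt.total_seconds().floordiv(60).astype(int)
    hours = (total_minutes // 60).astype(str).str.zfill(2)
    minutes = (total_minutes % 60).astype(str).str.zfill(2)
    return hours + ':' + minutes

# Two sample rows in the same day-first format as the question
start = pd.to_datetime(pd.Series(['29.12.2020 11:10', '29.12.2020 23:15']), format='%d.%m.%Y %H:%M')
end = pd.to_datetime(pd.Series(['29.12.2020 23:15', '30.12.2020 05:15']), format='%d.%m.%Y %H:%M')
print(to_hhmm(end - start).tolist())
```

Differences longer than a day are folded into the hour count, so 26 hours would come out as 26:00 rather than rolling over.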
You can try this.
import datetime

start_date = ['29.12.2020', '29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']
start_time = [datetime.time(11, 10), datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15)]
end_date = ['29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']
end_time = [datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15), datetime.time(23, 15)]
for start_d, start_t, end_d, end_t in zip(start_date, start_time, end_date, end_time):
    # Parse the date string, then set the time of day from the time object
    start_date_time = datetime.datetime.strptime(start_d, '%d.%m.%Y')
    start_date_time = start_date_time.replace(hour=start_t.hour, minute=start_t.minute)
    end_date_time = datetime.datetime.strptime(end_d, '%d.%m.%Y')
    end_date_time = end_date_time.replace(hour=end_t.hour, minute=end_t.minute)
    # Subtracting two datetimes gives a timedelta (days plus leftover seconds)
    time_diff = end_date_time - start_date_time
    diff_str = f"{time_diff.days}days {time_diff.seconds//3600}hours {(time_diff.seconds//60)%60}minutes "
    print(diff_str)
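The loop above prints days, hours and minutes separately, while the question asks for HH:MM. A minimal sketch of one way to get that from a `timedelta` follows; `format_hhmm` is a hypothetical helper name, and it assumes the difference is not negative.

```python
import datetime

def format_hhmm(delta: datetime.timedelta) -> str:
    """Return a non-negative timedelta as HH:MM, folding whole days into hours."""
    total_minutes = int(delta.total_seconds()) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}"

# First row of the question: 29.12.2020 11:10 to 29.12.2020 23:15
day = datetime.datetime.strptime('29.12.2020', '%d.%m.%Y').date()
start = datetime.datetime.combine(day, datetime.time(11, 10))
end = datetime.datetime.combine(day, datetime.time(23, 15))
print(format_hhmm(end - start))  # 12:05
```

`datetime.datetime.combine` is used here instead of `replace` so that seconds and microseconds from the time object are carried over as well.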