用 pandas 数据帧中的均值计算时间差?
Calculate time difference with mean in pandas dataframe?
让我们取任意两个日期时间列,我想计算以下公式以获得平均值。
mean(24*(closed_time - created_time ))
在 excel 中,我尝试应用相同的逻辑,并获得以下值,
closed time created date mean(24*(closed_time - created_time ))
5/14/2022 8:35 5/11/2022 1:08 79.45
5/14/2022 8:12 5/13/2022 8:45 23.45
5/14/2022 8:34 5/13/2022 11:47 20.78333333
5/11/2022 11:21 5/9/2022 16:43 42.63333333
5/11/2022 11:30 5/8/2022 19:51 63.65
5/11/2022 11:22 5/6/2022 16:45 114.6166667
5/11/2022 11:25 5/9/2022 19:53 39.53333333
5/11/2022 11:28 5/9/2022 10:52 48.6
任何帮助将不胜感激!!
不确定 mean
,在相同输出的样本数据中通过将秒数转换为小时数减去列:
cols = ['closed time','created date']
df[cols] = df[cols].apply(pd.to_datetime)
df['mean1'] = df['closed time'].sub(df['created date']).dt.total_seconds().div(3600)
print (df)
closed time created date mean mean1
0 2022-05-14 08:35:00 2022-05-11 01:08:00 79.450000 79.450000
1 2022-05-14 08:12:00 2022-05-13 08:45:00 23.450000 23.450000
2 2022-05-14 08:34:00 2022-05-13 11:47:00 20.783333 20.783333
3 2022-05-11 11:21:00 2022-05-09 16:43:00 42.633333 42.633333
4 2022-05-11 11:30:00 2022-05-08 19:51:00 63.650000 63.650000
5 2022-05-11 11:22:00 2022-05-06 16:45:00 114.616667 114.616667
6 2022-05-11 11:25:00 2022-05-09 19:53:00 39.533333 39.533333
7 2022-05-11 11:28:00 2022-05-09 10:52:00 48.600000 48.600000
两个日期时间的平均值计算方式:
df['mean']=pd.to_datetime(df[['closed time','created date']].astype(np.int64).mean(axis=1))
print (df)
closed time created date mean
0 2022-05-14 08:35:00 2022-05-11 01:08:00 2022-05-12 16:51:30
1 2022-05-14 08:12:00 2022-05-13 08:45:00 2022-05-13 20:28:30
2 2022-05-14 08:34:00 2022-05-13 11:47:00 2022-05-13 22:10:30
3 2022-05-11 11:21:00 2022-05-09 16:43:00 2022-05-10 14:02:00
4 2022-05-11 11:30:00 2022-05-08 19:51:00 2022-05-10 03:40:30
5 2022-05-11 11:22:00 2022-05-06 16:45:00 2022-05-09 02:03:30
6 2022-05-11 11:25:00 2022-05-09 19:53:00 2022-05-10 15:39:00
7 2022-05-11 11:28:00 2022-05-09 10:52:00 2022-05-10 11:10:00
让我们取任意两个日期时间列,我想计算以下公式以获得平均值。
mean(24*(closed_time - created_time ))
在 excel 中,我尝试应用相同的逻辑,并获得以下值,
closed time created date mean(24*(closed_time - created_time ))
5/14/2022 8:35 5/11/2022 1:08 79.45
5/14/2022 8:12 5/13/2022 8:45 23.45
5/14/2022 8:34 5/13/2022 11:47 20.78333333
5/11/2022 11:21 5/9/2022 16:43 42.63333333
5/11/2022 11:30 5/8/2022 19:51 63.65
5/11/2022 11:22 5/6/2022 16:45 114.6166667
5/11/2022 11:25 5/9/2022 19:53 39.53333333
5/11/2022 11:28 5/9/2022 10:52 48.6
任何帮助将不胜感激!!
不确定 mean
,在相同输出的样本数据中通过将秒数转换为小时数减去列:
cols = ['closed time','created date']
df[cols] = df[cols].apply(pd.to_datetime)
df['mean1'] = df['closed time'].sub(df['created date']).dt.total_seconds().div(3600)
print (df)
closed time created date mean mean1
0 2022-05-14 08:35:00 2022-05-11 01:08:00 79.450000 79.450000
1 2022-05-14 08:12:00 2022-05-13 08:45:00 23.450000 23.450000
2 2022-05-14 08:34:00 2022-05-13 11:47:00 20.783333 20.783333
3 2022-05-11 11:21:00 2022-05-09 16:43:00 42.633333 42.633333
4 2022-05-11 11:30:00 2022-05-08 19:51:00 63.650000 63.650000
5 2022-05-11 11:22:00 2022-05-06 16:45:00 114.616667 114.616667
6 2022-05-11 11:25:00 2022-05-09 19:53:00 39.533333 39.533333
7 2022-05-11 11:28:00 2022-05-09 10:52:00 48.600000 48.600000
两个日期时间的平均值计算方式:
df['mean']=pd.to_datetime(df[['closed time','created date']].astype(np.int64).mean(axis=1))
print (df)
closed time created date mean
0 2022-05-14 08:35:00 2022-05-11 01:08:00 2022-05-12 16:51:30
1 2022-05-14 08:12:00 2022-05-13 08:45:00 2022-05-13 20:28:30
2 2022-05-14 08:34:00 2022-05-13 11:47:00 2022-05-13 22:10:30
3 2022-05-11 11:21:00 2022-05-09 16:43:00 2022-05-10 14:02:00
4 2022-05-11 11:30:00 2022-05-08 19:51:00 2022-05-10 03:40:30
5 2022-05-11 11:22:00 2022-05-06 16:45:00 2022-05-09 02:03:30
6 2022-05-11 11:25:00 2022-05-09 19:53:00 2022-05-10 15:39:00
7 2022-05-11 11:28:00 2022-05-09 10:52:00 2022-05-10 11:10:00