Python: calculating time differences in a loop
I have the following data:
data = {'timestamp': ['Friday, October 15, 2021 3:40 PM', 'Oct 15, 2021 03:06:29 PM', 'Friday, October 15, 2021 2:28 PM', 'Oct 15, 2021 06:23:51 AM', 'Oct 15, 2021 04:19:07 AM', 'Oct 15, 2021 08:19:07 AM'],
'emailuser': ['michael@google.com', 'caron@yt.com', 'luke@yt.com', 'sav@google.com','sav@google.com', 'paul@yt.com']
}
data = pd.DataFrame(data)
print(data)
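I want to calculate the average response time of Google employees. So in this case, I want the time difference between:
- michael@google.com - luke@yt.com (the timestamp of caron@yt.com can be skipped, because caron and luke are at the same company)
- sav@google.com - paul@yt.com is ignored, because it would give a negative time difference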
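To make the two cases concrete, the arithmetic can be checked by hand. This is only a minimal sketch using the standard library; the variable names are illustrative and the timestamps are copied from the sample data.

```python
from datetime import datetime, timedelta

# Timestamps copied from the sample data; the variable names are illustrative.
michael = datetime(2021, 10, 15, 15, 40)
luke = datetime(2021, 10, 15, 14, 28)
sav = datetime(2021, 10, 15, 4, 19, 7)
paul = datetime(2021, 10, 15, 8, 19, 7)

kept = michael - luke   # positive difference, so it counts
ignored = sav - paul    # negative difference, so it is dropped

print(kept)                     # 1:12:00
print(ignored < timedelta(0))   # True
```

Only the first pair contributes to the average; the second is discarded because the difference is below zero.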
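Well, it's not very pretty, but this should meet the spec. If you have any questions or spot any discrepancy with the spec, please leave a comment.
Code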
import datetime

import pandas as pd

data = {
    "timestamp": [
        "Friday, October 15, 2021 3:40 PM",
        "Oct 15, 2021 03:06:29 PM",
        "Friday, October 15, 2021 2:28 PM",
        "Oct 15, 2021 06:23:51 AM",
        "Oct 15, 2021 04:19:07 AM",
        "Oct 15, 2021 08:19:07 AM",
    ],
    "emailuser": [
        "michael@google.com",
        "caron@yt.com",
        "luke@yt.com",
        "sav@google.com",
        "sav@google.com",
        "paul@yt.com",
    ],
}


def extract_datetime(timestamp: str) -> datetime.datetime:
    # Try each known timestamp layout in turn; the first one that parses wins.
    for fmt in ["%A, %B %d, %Y %I:%M %p", "%b %d, %Y %I:%M:%S %p"]:
        try:
            return datetime.datetime.strptime(timestamp, fmt)
        except ValueError:
            pass
    raise ValueError(f"Timestamp {timestamp} is invalid")


data = pd.DataFrame(data)
data["timestamp"] = data["timestamp"].apply(extract_datetime)
data["delta"] = datetime.timedelta(seconds=0)

gem = "@google.com"
delta_col = data.columns.get_loc("delta")
for i in range(len(data)):
    if data["emailuser"].iat[i].endswith(gem):
        # Scan the following rows until the next Google address is reached.
        # The range ends at len(data) so the last row is checked as well.
        for j in range(i + 1, len(data)):
            if not data["emailuser"].iat[j].endswith(gem):
                delta = data["timestamp"].iat[i] - data["timestamp"].iat[j]
                # Negative differences are ignored, as the spec asks.
                if delta > datetime.timedelta(0):
                    # Write through the frame itself, not a chained selection.
                    data.iat[i, delta_col] = delta
            else:
                break
print(data)
Output
            timestamp           emailuser           delta
0 2021-10-15 15:40:00  michael@google.com 0 days 01:12:00
1 2021-10-15 15:06:29        caron@yt.com 0 days 00:00:00
2 2021-10-15 14:28:00         luke@yt.com 0 days 00:00:00
3 2021-10-15 06:23:51      sav@google.com 0 days 00:00:00
4 2021-10-15 04:19:07      sav@google.com 0 days 00:00:00
5 2021-10-15 08:19:07         paul@yt.com 0 days 00:00:00
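The question asks for an average, and the table above stops at the per-row deltas. One way to get from there to a mean is sketched below, under the assumption that a zero delta means "no response measured" and should be left out. The `deltas` series is rebuilt by hand from the output above, and the names `measured` and `average` are illustrative.

```python
import pandas as pd

# The delta column from the output above, rebuilt by hand for illustration.
deltas = pd.Series([pd.Timedelta(hours=1, minutes=12)] + [pd.Timedelta(0)] * 5)

# Assumption: a zero delta marks a row with no measured response, so skip it.
measured = deltas[deltas > pd.Timedelta(0)]
average = measured.mean()

print(average)
```

With the sample data only one row has a measured response, so the average equals that single delta. If a genuine zero-second response were possible, a separate marker such as `pd.NaT` would be a safer placeholder than zero.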