How can I find all datetimes in a list within n minutes of a given value?
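I have two lists of times (hour:minute:second format), and I have been struggling to compare each entry in list_a against all entries in list_b to identify the values that fall within 30 minutes: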
list_a = ["10:26:42", "8:55:43", "7:34:11"]
list_b = ["10:49:20", "8:51:10", "10:34:35", "8:39:47", "7:11:49", "7:42:10"]
Expected output:
10:26:42 is within 30m of 10:49:20, 10:34:35
8:55:43 is within 30m of 8:51:10, 8:39:47
7:34:11 is within 30m of 7:11:49, 7:42:10
What I have been doing so far:
import datetime

# Convert the lists to datetime format
list_a_times = []
list_b_times = []
for data in list_a:
    convert = datetime.datetime.strptime(data, "%H:%M:%S")
    list_a_times.append(convert)
for data in list_b:
    convert = datetime.datetime.strptime(data, "%H:%M:%S")
    list_b_times.append(convert)

# Using a value of list A, find the closest value in list B
for data in list_a_times:
    closest_to_data = min(list_b_times, key=lambda d: abs(d - data))
    print(data, closest_to_data)
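This approach works, but it only finds the single closest value! How can I manipulate the min() function so that it keeps providing values within the desired 30 minutes or less?

Instead of using min, loop over all elements and filter on the absolute time difference: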
list_a = ["10:26:42", "8:55:43", "7:34:11"]
list_b = ["10:49:20", "8:51:10", "10:34:35", "8:39:47", "7:11:49", "7:42:10"]

import datetime

# Convert the lists to datetime format
list_a = [datetime.datetime.strptime(d, "%H:%M:%S") for d in list_a]
list_b = [datetime.datetime.strptime(d, "%H:%M:%S") for d in list_b]

for value in list_a:
    for v in list_b:
        # Keep every pair closer than 30 minutes, not just the closest one
        if abs(value - v) < datetime.timedelta(minutes=30):
            print(value, "=>", v, "diff: ", (value - v).total_seconds() // 60)
    print()
Output:
1900-01-01 10:26:42 => 1900-01-01 10:49:20 diff: -23.0
1900-01-01 10:26:42 => 1900-01-01 10:34:35 diff: -8.0
1900-01-01 08:55:43 => 1900-01-01 08:51:10 diff: 4.0
1900-01-01 08:55:43 => 1900-01-01 08:39:47 diff: 15.0
1900-01-01 07:34:11 => 1900-01-01 07:11:49 diff: 22.0
1900-01-01 07:34:11 => 1900-01-01 07:42:10 diff: -8.0
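To get the grouped "is within 30m of" lines the question asks for, the same filter can collect matches per entry instead of printing each pair. This is only a minimal sketch: the helper names parse_time and matches_within are made up for illustration, and it uses <= so that a difference of exactly 30 minutes counts as a match.

```python
from datetime import datetime, timedelta


def parse_time(text):
    """Parse an H:M:S string; strptime fills in 1900-01-01 as the date."""
    return datetime.strptime(text, "%H:%M:%S")


def matches_within(list_a, list_b, minutes=30):
    """Map each string in list_a to the strings in list_b within `minutes` of it."""
    limit = timedelta(minutes=minutes)
    # Parse list_b once and keep the original strings for output
    parsed_b = [(b, parse_time(b)) for b in list_b]
    result = {}
    for a in list_a:
        a_time = parse_time(a)
        result[a] = [b for b, b_time in parsed_b if abs(a_time - b_time) <= limit]
    return result


list_a = ["10:26:42", "8:55:43", "7:34:11"]
list_b = ["10:49:20", "8:51:10", "10:34:35", "8:39:47", "7:11:49", "7:42:10"]

for a, hits in matches_within(list_a, list_b).items():
    print(f"{a} is within 30m of {', '.join(hits)}")
```

Because the original strings are kept, the printed times are not zero-padded, which matches the expected output in the question.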
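A plain datetime difference like this misses pairs such as 0:05:00 and 23:55:00. They are only 10 minutes apart across midnight, but they belong to different days, and with the same default date they appear 23:50 apart.
You can remedy this with a hand-written delta calculation: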
def abs_time_diff(dt1, dt2, *, ignore_date=False):
    if not ignore_date:
        return abs(dt1 - dt2)
    # use day before, this day and day after, report minimum
    return min(abs(dt1 + datetime.timedelta(days=delta) - dt2)
               for delta in range(-1, 2))

list_a = ["0:5:0"]
list_b = ["0:20:0", "23:55:0"]
list_a = [datetime.datetime.strptime(d, "%H:%M:%S") for d in list_a]
list_b = [datetime.datetime.strptime(d, "%H:%M:%S") for d in list_b]

for value in list_a:
    for v in list_b:
        print(value, v, abs_time_diff(value, v))
        print(value, v, abs_time_diff(value, v, ignore_date=True))
Output:
1900-01-01 00:05:00 1900-01-01 00:20:00 0:15:00
1900-01-01 00:05:00 1900-01-01 00:20:00 0:15:00
1900-01-01 00:05:00 1900-01-01 23:55:00 23:50:00 # with date
1900-01-01 00:05:00 1900-01-01 23:55:00 0:10:00 # ignores date
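The same wrap-around idea can also be written with seconds since midnight and no datetime objects. The sketch below treats the clock as a 24-hour circle and takes the shorter way around. The names seconds_since_midnight and circular_diff are hypothetical, and it assumes well-formed H:M:S input.

```python
from datetime import timedelta

SECONDS_PER_DAY = 24 * 60 * 60


def seconds_since_midnight(text):
    """Convert an H:M:S string to a number of seconds since 00:00:00."""
    hours, minutes, seconds = map(int, text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def circular_diff(time_a, time_b):
    """Shortest distance between two clock times, allowing a wrap past midnight."""
    diff = abs(seconds_since_midnight(time_a) - seconds_since_midnight(time_b))
    # Either go directly, or go the other way around the 24-hour clock
    return timedelta(seconds=min(diff, SECONDS_PER_DAY - diff))


print(circular_diff("0:5:0", "0:20:0"))   # 0:15:00
print(circular_diff("0:5:0", "23:55:0"))  # 0:10:00
```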
IIUC, you want to compare all combinations, so you need to check every pair.
Please read the note on datetime/timedelta at the end of this answer.
Using itertools.product:
list_a = ['10:26:42', '8:55:43', '7:34:11']
list_b = ['10:49:20', '8:51:10', '10:34:35', '8:39:47', '7:11:49', '7:42:10']

import datetime
from itertools import product

str2time = lambda s: datetime.datetime.strptime(s, "%H:%M:%S")

# product() yields every (a, b) pair from the two lists
for a, b in product(map(str2time, list_a), map(str2time, list_b)):
    if abs(a-b).total_seconds() <= 1800:
        print(f'{a:%H:%M:%S} is within 30m of {b:%H:%M:%S}')
Output:
10:26:42 is within 30m of 10:49:20
10:26:42 is within 30m of 10:34:35
08:55:43 is within 30m of 08:51:10
08:55:43 is within 30m of 08:39:47
07:34:11 is within 30m of 07:11:49
07:34:11 is within 30m of 07:42:10
Using nested for loops:
import datetime

str2time = lambda s: datetime.datetime.strptime(s, "%H:%M:%S")

for a in map(str2time, list_a):
    start = f'{a:%H:%M:%S} is within 30m of'
    for b in map(str2time, list_b):
        if abs(a-b).total_seconds() <= 1800:
            print(f'{start} {b:%H:%M:%S}', end='')
            # after the first match, separate further matches with a comma
            start = ','
    if start == ',':
        print()
Output:
10:26:42 is within 30m of 10:49:20, 10:34:35
08:55:43 is within 30m of 08:51:10, 08:39:47
07:34:11 is within 30m of 07:11:49, 07:42:10
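A note on datetime
Using datetime without a date defaults to 1900-01-01, which can cause edge effects close to midnight. You can use timedelta objects instead. With my code, you would need to change the str2time function to: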
def str2time(s):
    h, m, s = map(int, s.split(':'))
    return datetime.timedelta(hours=h, minutes=m, seconds=s)
And change the code a bit so the values can be converted back to strings:
z = datetime.datetime(1900, 1, 1)

for a in map(str2time, list_a):
    # adding the timedelta to a fixed datetime makes it formattable
    start = f'{z+a:%H:%M:%S} is within 30m of'
    for b in map(str2time, list_b):
        if abs(a-b).total_seconds() <= 1800:
            print(f'{start} {z+b:%H:%M:%S}', end='')
            start = ','
    if start == ',':
        print()
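If adding an anchor datetime only to format a timedelta feels indirect, the hours, minutes and seconds can also be split out with divmod. This is a small sketch rather than part of the answer above. The name format_timedelta is hypothetical, and it assumes a non-negative duration.

```python
from datetime import timedelta


def format_timedelta(td):
    """Format a non-negative timedelta as zero-padded HH:MM:SS."""
    total_seconds = int(td.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(format_timedelta(timedelta(hours=8, minutes=55, seconds=43)))
```

Unlike the anchor-datetime trick, this does not wrap at 24 hours, so a 25-hour duration is shown as 25:00:00.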
Here is a suggestion using pandas:
# Convert to pandas datetime series
# (list_a and list_b are the string lists from the question)
import pandas as pd
dt_a = pd.to_datetime(pd.Series(list_a), format='%H:%M:%S')
dt_b = pd.to_datetime(pd.Series(list_b), format='%H:%M:%S')

# Comparison loop
interv_size = '30m'  # Thirty minutes
for el in dt_a:
    # Boolean mask selects every entry of dt_b within the interval of el
    hits = dt_b.loc[abs(el - dt_b) < pd.Timedelta(interv_size)].dt.time
    print(f'{el.time()} is within {interv_size} of', *hits)
The advantage? You let Python handle the date formatting.
from datetime import datetime, timedelta

list_a = ["10:26:42", "8:55:43", "7:34:11"]
list_b = ["10:49:20", "8:51:10", "10:34:35", "8:39:47", "7:11:49", "7:42:10"]
time_format = "%H:%M:%S"

def convert_to_datetime(time_str):
    return datetime.strptime(time_str, time_format)

# Overriding list_a and list_b to avoid polluting the namespace
# Sorting for simple optimization
list_a = sorted([convert_to_datetime(time_str) for time_str in list_a])
list_b = sorted([convert_to_datetime(time_str) for time_str in list_b])
time_range_limit_in_seconds = timedelta(minutes=30).total_seconds()

for list_a_datetime in list_a:
    with_in_time_limit = []
    for list_b_datetime in list_b:
        difference_in_seconds = (
            list_a_datetime - list_b_datetime).total_seconds()
        # Since list_b is sorted, once an entry is more than 30 minutes
        # later than the list_a value, all the rest are out of range too
        if difference_in_seconds < -time_range_limit_in_seconds:
            break
        if difference_in_seconds <= time_range_limit_in_seconds:
            # Convert back to string
            with_in_time_limit.append(
                list_b_datetime.strftime(time_format)
            )
    print(list_a_datetime.strftime(time_format), with_in_time_limit)
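Once list_b is sorted, the inner loop can be replaced by two binary searches from the standard bisect module, which locate the first and last entries inside the 30-minute window. This is a sketch of that variation, not part of the answer above. The name window_matches is hypothetical, and like the other snippets it ignores wrap-around at midnight.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta

TIME_FORMAT = "%H:%M:%S"


def window_matches(list_a, list_b, minutes=30):
    """For each time in list_a, return the sorted list_b times within `minutes`."""
    limit = timedelta(minutes=minutes)
    sorted_b = sorted(datetime.strptime(b, TIME_FORMAT) for b in list_b)
    result = {}
    for a in list_a:
        a_time = datetime.strptime(a, TIME_FORMAT)
        # Index range of sorted_b covering [a_time - limit, a_time + limit]
        lo = bisect_left(sorted_b, a_time - limit)
        hi = bisect_right(sorted_b, a_time + limit)
        result[a] = [b.strftime(TIME_FORMAT) for b in sorted_b[lo:hi]]
    return result


list_a = ["10:26:42", "8:55:43", "7:34:11"]
list_b = ["10:49:20", "8:51:10", "10:34:35", "8:39:47", "7:11:49", "7:42:10"]

for a, hits in window_matches(list_a, list_b).items():
    print(a, hits)
```

For short lists like these the nested loop is just as good. The binary search only pays off when list_b is large, since each lookup then costs O(log n) plus the number of matches.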