How to check if a datetime is within a list of dictionaries of datetime intervals

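I have a YAML file containing a series of time intervals. I want to load this YAML into my Python script and then check whether another datetime falls within any of these intervals.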

YAML file:

ignore_state_change:
  - start_time_utc: 2021-08-04 23:25:00.31099+00:00
    end_time_utc: 2021-08-05 00:25:00.31099+00:00
  - start_time_utc: 2021-08-05 01:25:00.31099+00:00
    end_time_utc: 2021-08-05 02:25:00.31099+00:00

Loading script:

import yaml
with open(r'test.yml') as file:
    config = yaml.load(file, Loader=yaml.FullLoader)
for item in config['ignore_state_change']:
    print(item)

Result:

{'start_time_utc': datetime.datetime(2021, 8, 4, 23, 25, 0, 310990, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2021, 8, 5, 0, 25, 0, 310990, tzinfo=datetime.timezone.utc)}
{'start_time_utc': datetime.datetime(2021, 8, 5, 1, 25, 0, 310990, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2021, 8, 5, 2, 25, 0, 310990, tzinfo=datetime.timezone.utc)}

Now I want to check whether a datetime variable event_time_utc falls within any of the start/end intervals above. Iterating over each dictionary item and using >start_time_utc and

Thanks!

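Here is a working O((n log n) + (m log n)) solution, where n is the number of StartTime-EndTime pairs and m is the number of event_time_utc dates to look up. It works whether or not the YAML file contains overlapping dates.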

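  1. O(n) Get all StartTime-EndTime pairs.
    [['2021-01-15' - '2021-01-25'], ['2021-03-01' - '2021-03-02'], ['2021-01-01' - '2021-01-20']]
    
  2. O(n log n) Sort the StartTime-EndTime pairs, with the earliest at the start of the list and the latest at the end.
    [['2021-01-01' - '2021-01-20'], ['2021-01-15' - '2021-01-25'], ['2021-03-01' - '2021-03-02']]
    
  3. O(n) Merge consecutive StartTime-EndTime pairs in the sorted list whenever they overlap, that is, whenever they cover some of the same time.
    [['2021-01-01' - '2021-01-25'], ['2021-03-01' - '2021-03-02']]
    
  4. O(m log n) Now all we need to do for each date event_time_utc is run a binary search on the sorted, merged list. Because the list is already sorted and merged, we only need to check whether the date falls within the range of the StartTime-EndTime pair at, or immediately before, the insertion point in the sorted list.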
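
Steps 1 to 3 can be sketched on the toy dates from the list above. This is only an illustration under some assumptions: the function name merge_intervals is made up here, the intervals are treated as closed ranges with start <= end, and plain date objects stand in for the timezone-aware datetimes that the YAML loader returns.

```python
from datetime import date

# Step 1: the toy StartTime-EndTime pairs from the list above (closed ranges).
pairs = [
    (date(2021, 1, 15), date(2021, 1, 25)),
    (date(2021, 3, 1), date(2021, 3, 2)),
    (date(2021, 1, 1), date(2021, 1, 20)),
]


def merge_intervals(intervals):
    """Return sorted, non-overlapping intervals covering the same time as the input.

    Hypothetical helper for illustration; assumes start <= end in every pair.
    """
    merged = []
    for start, end in sorted(intervals):  # step 2: sort by start, O(n log n)
        if merged and start <= merged[-1][1]:
            # Step 3: this pair overlaps (or touches) the previous one, so extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


merged = merge_intervals(pairs)
for start, end in merged:
    print(start, "-", end)
# 2021-01-01 - 2021-01-25
# 2021-03-01 - 2021-03-02
```

Building new tuples instead of mutating the input pairs keeps the original list intact, which can be handy if the unmerged intervals are still needed for logging.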

test.yml

ignore_state_change:
  - start_time_utc: 2021-08-04 23:25:00.31099+00:00
    end_time_utc: 2021-08-05 00:25:00.31099+00:00
  - start_time_utc: 2021-10-01 00:00:00.00000+00:00
    end_time_utc: 2021-10-02 23:59:59.99999+00:00
  - start_time_utc: 2021-08-17 23:23:23.00000+00:00
    end_time_utc: 2021-09-30 23:59:59.99999+00:00
  - start_time_utc: 2021-08-09 22:22:22.00000+00:00
    end_time_utc: 2021-08-17 23:23:23.00000+00:00
  - start_time_utc: 2021-08-05 01:25:00.31099+00:00
    end_time_utc: 2021-08-05 02:25:00.31099+00:00
  - start_time_utc: 2020-05-25 00:00:00.00000+00:00
    end_time_utc: 2020-06-25 00:00:00.00000+00:00
  - start_time_utc: 2021-08-11 00:00:00.00000+00:00
    end_time_utc: 2021-08-13 23:59:59.99999+00:00
  - start_time_utc: 2021-08-11 00:00:00.00000+00:00
    end_time_utc: 2021-08-13 23:59:59.99999+00:00
  - start_time_utc: 2020-06-15 00:00:00.00000+00:00
    end_time_utc: 2020-07-25 00:00:00.00000+00:00

script.py

import bisect
from datetime import datetime, timezone
import yaml

# Get the timestamps
with open(r'test.yml') as file:
    config = yaml.load(file, Loader=yaml.FullLoader)
print("Config file")
for i, a in enumerate(config['ignore_state_change']):
    print(i, a)
print()

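# Sort the timestamps by start time, then end time.
# sorted() keeps this step at O(n log n); repeated bisect.insort calls
# shift list elements on every insert and can cost O(n^2) overall.
dt_sorted = sorted(
    [item["start_time_utc"], item["end_time_utc"]]
    for item in config['ignore_state_change']
)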
print("Sorted timestamps")
for i, a in enumerate(dt_sorted):
    print(i, a)
print()

# Merge the timestamps
dt_merged = [dt_sorted[0]]
for dt in dt_sorted[1:]:
    if dt[0] <= dt_merged[-1][1]:
        dt_merged[-1][1] = max(dt_merged[-1][1], dt[1])
    else:
        dt_merged.append(dt)
print("Merged timestamps")
for i, a in enumerate(dt_merged):
    print(i, a)
print()

# Binary search each time in the merged timestamps
print("Results")
for event_time_utc in [
    datetime(2021, 10, 1, 0, 0, tzinfo=timezone.utc),
    datetime(2021, 8, 17, 23, 23, 23, tzinfo=timezone.utc),
    datetime(2021, 8, 5, 0, 25, 0, 310991, tzinfo=timezone.utc),
    datetime(2021, 8, 5, 0, 25, 0, 310990, tzinfo=timezone.utc),
    datetime(2021, 8, 5, 0, 26, 0, 310990, tzinfo=timezone.utc),
    datetime(2021, 8, 5, 1, 26, 0, 310990, tzinfo=timezone.utc),
    datetime(2021, 8, 5, 1, 25, 0, 310990, tzinfo=timezone.utc),
    datetime(2020, 5, 24, 0, 0, tzinfo=timezone.utc),
    datetime(2020, 5, 25, 0, 0, tzinfo=timezone.utc),
    datetime(2020, 6, 25, 1, 0, tzinfo=timezone.utc),
    datetime(2020, 6, 25, 0, 0, tzinfo=timezone.utc),
    datetime(1993, 1, 2, 3, 4, tzinfo=timezone.utc),
    datetime(2021, 10, 2, 23, 59, 59, 999990, tzinfo=timezone.utc),
    datetime(2021, 11, 2, 23, 59, 59, 999990, tzinfo=timezone.utc),
    datetime(2021, 8, 14, 1, 26, 0, 310990, tzinfo=timezone.utc),
]:
    ref_point = bisect.bisect_left(dt_merged, [event_time_utc, event_time_utc])
    for index in (ref_point - 1, ref_point):
        if (
            0 <= index < len(dt_merged)
            and dt_merged[index][0] <= event_time_utc <= dt_merged[index][1]
        ):
            print(f"{event_time_utc} is in range of {dt_merged[index]}")
            break
    else:
        print(f"{event_time_utc} is not in range")

Output

Config file
0 {'start_time_utc': datetime.datetime(2021, 8, 4, 23, 25, 0, 310990, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2021, 8, 5, 0, 25, 0, 310990, tzinfo=datetime.timezone.utc)}
1 {'start_time_utc': datetime.datetime(2021, 10, 1, 0, 0, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2021, 10, 2, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)}
2 {'start_time_utc': datetime.datetime(2021, 8, 17, 23, 23, 23, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2021, 9, 30, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)}
3 {'start_time_utc': datetime.datetime(2021, 8, 9, 22, 22, 22, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2021, 8, 17, 23, 23, 23, tzinfo=datetime.timezone.utc)}
4 {'start_time_utc': datetime.datetime(2021, 8, 5, 1, 25, 0, 310990, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2021, 8, 5, 2, 25, 0, 310990, tzinfo=datetime.timezone.utc)}
5 {'start_time_utc': datetime.datetime(2020, 5, 25, 0, 0, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2020, 6, 25, 0, 0, tzinfo=datetime.timezone.utc)}
6 {'start_time_utc': datetime.datetime(2021, 8, 11, 0, 0, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2021, 8, 13, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)}
7 {'start_time_utc': datetime.datetime(2021, 8, 11, 0, 0, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2021, 8, 13, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)}
8 {'start_time_utc': datetime.datetime(2020, 6, 15, 0, 0, tzinfo=datetime.timezone.utc), 'end_time_utc': datetime.datetime(2020, 7, 25, 0, 0, tzinfo=datetime.timezone.utc)}

Sorted timestamps
0 [datetime.datetime(2020, 5, 25, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2020, 6, 25, 0, 0, tzinfo=datetime.timezone.utc)]
1 [datetime.datetime(2020, 6, 15, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2020, 7, 25, 0, 0, tzinfo=datetime.timezone.utc)]
2 [datetime.datetime(2021, 8, 4, 23, 25, 0, 310990, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 8, 5, 0, 25, 0, 310990, tzinfo=datetime.timezone.utc)]
3 [datetime.datetime(2021, 8, 5, 1, 25, 0, 310990, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 8, 5, 2, 25, 0, 310990, tzinfo=datetime.timezone.utc)]
4 [datetime.datetime(2021, 8, 9, 22, 22, 22, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 8, 17, 23, 23, 23, tzinfo=datetime.timezone.utc)]
5 [datetime.datetime(2021, 8, 11, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 8, 13, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)]
6 [datetime.datetime(2021, 8, 11, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 8, 13, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)]
7 [datetime.datetime(2021, 8, 17, 23, 23, 23, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 9, 30, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)]
8 [datetime.datetime(2021, 10, 1, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 10, 2, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)]

Merged timestamps
0 [datetime.datetime(2020, 5, 25, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2020, 7, 25, 0, 0, tzinfo=datetime.timezone.utc)]
1 [datetime.datetime(2021, 8, 4, 23, 25, 0, 310990, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 8, 5, 0, 25, 0, 310990, tzinfo=datetime.timezone.utc)]
2 [datetime.datetime(2021, 8, 5, 1, 25, 0, 310990, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 8, 5, 2, 25, 0, 310990, tzinfo=datetime.timezone.utc)]
3 [datetime.datetime(2021, 8, 9, 22, 22, 22, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 9, 30, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)]
4 [datetime.datetime(2021, 10, 1, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 10, 2, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)]

Results
2021-10-01 00:00:00+00:00 is in range of [datetime.datetime(2021, 10, 1, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 10, 2, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)]
2021-08-17 23:23:23+00:00 is in range of [datetime.datetime(2021, 8, 9, 22, 22, 22, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 9, 30, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)]
2021-08-05 00:25:00.310991+00:00 is not in range
2021-08-05 00:25:00.310990+00:00 is in range of [datetime.datetime(2021, 8, 4, 23, 25, 0, 310990, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 8, 5, 0, 25, 0, 310990, tzinfo=datetime.timezone.utc)]
2021-08-05 00:26:00.310990+00:00 is not in range
2021-08-05 01:26:00.310990+00:00 is in range of [datetime.datetime(2021, 8, 5, 1, 25, 0, 310990, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 8, 5, 2, 25, 0, 310990, tzinfo=datetime.timezone.utc)]
2021-08-05 01:25:00.310990+00:00 is in range of [datetime.datetime(2021, 8, 5, 1, 25, 0, 310990, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 8, 5, 2, 25, 0, 310990, tzinfo=datetime.timezone.utc)]
2020-05-24 00:00:00+00:00 is not in range
2020-05-25 00:00:00+00:00 is in range of [datetime.datetime(2020, 5, 25, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2020, 7, 25, 0, 0, tzinfo=datetime.timezone.utc)]
2020-06-25 01:00:00+00:00 is in range of [datetime.datetime(2020, 5, 25, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2020, 7, 25, 0, 0, tzinfo=datetime.timezone.utc)]
2020-06-25 00:00:00+00:00 is in range of [datetime.datetime(2020, 5, 25, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2020, 7, 25, 0, 0, tzinfo=datetime.timezone.utc)]
1993-01-02 03:04:00+00:00 is not in range
2021-10-02 23:59:59.999990+00:00 is in range of [datetime.datetime(2021, 10, 1, 0, 0, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 10, 2, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)]
2021-11-02 23:59:59.999990+00:00 is not in range
2021-08-14 01:26:00.310990+00:00 is in range of [datetime.datetime(2021, 8, 9, 22, 22, 22, tzinfo=datetime.timezone.utc), datetime.datetime(2021, 9, 30, 23, 59, 59, 999990, tzinfo=datetime.timezone.utc)]
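
If the lookup is needed in more than one place, step 4 could be wrapped in a small function. The sketch below is one possible variant, not the script above: it keeps a separate list of start times and uses bisect.bisect_right, so only a single candidate interval has to be checked. The name find_interval and the sample intervals are made up for illustration, and the function assumes the intervals are already sorted, merged, and closed on both ends.

```python
import bisect
from datetime import datetime, timedelta, timezone


def find_interval(merged, starts, moment):
    """Return the closed interval in `merged` that contains `moment`, or None.

    Hypothetical helper: `merged` must be sorted and non-overlapping, and
    `starts` must hold the start time of each interval in the same order.
    """
    # Index of the last interval whose start is at or before `moment`.
    i = bisect.bisect_right(starts, moment) - 1
    if i >= 0 and moment <= merged[i][1]:
        return merged[i]
    return None


# Made-up sample data: two one-hour windows, one hour apart.
base = datetime(2021, 8, 5, tzinfo=timezone.utc)
merged = [
    (base, base + timedelta(hours=1)),
    (base + timedelta(hours=2), base + timedelta(hours=3)),
]
starts = [start for start, _ in merged]

inside = find_interval(merged, starts, base + timedelta(minutes=30))
between = find_interval(merged, starts, base + timedelta(minutes=90))
print(inside is not None)  # True
print(between)             # None
```

Because the merged intervals never overlap, the interval with the latest start at or before the event time is the only one that can contain it, which is why one comparison against its end time is enough.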