使用 "complex in" 条件在两个集合之间搜索匹配项

Question

假设我有两套：

set1 = set(("a", "b", 5), ("a", "c", 3))
set2 = set(("a", "b", 3), ("a", "b", 7))

每个项目有 3 个值：

第一个索引（键的第一部分）例如“一个”
第二个索引（键的第二部分）例如“b”
第三个索引（键的最后一部分的时间戳）例如 5

我想搜索集合 1 的所有元素，这些元素在集合 2 中有相应的项目，其中相应的项目是在第一个和第二个索引处具有相同值的任何项目，但值set2 中匹配项目的时间戳必须低于 set1 中项目的时间戳。

在示例中，匹配项将是：

~~("a", "b", 5) ("a", "b", 3)~~ 时间戳 3 < 5
✓ ("a", "b", 5) ("a", "b", 7)
~~("a", "c", 3) ("a", "b", 3)~~ "c" != "b"
~~("a", "c", 3) ("a", "b", 7)~~ "c" != "b"

现在的问题是我需要在 n.log(n) 时间内搜索 set2 中的项目。这就是为什么我试图将值保留在集合中。如果我正在寻找没有时间戳的项目：

set1 = set(("a", "b"), ("a", "c"))
set2 = set(("a", "b"), ("a", "b"))

我能做到：

for item from set1:
  if item in set2:
    print("Found matching item")

但是有了时间戳，我不知道如何在没有两个 for 循环的情况下实现它。

Answer 1

您可以从 set2 创建字典，其中键是元组的前两部分，值是所有时间戳的列表。然后你可以找到匹配的键并测试set1中的时间戳是否低于其中任何一个。

d = {}
for item in set2:
    d.setdefault(item[0:2], []).append(item[2])

for item in set1:
    if any(item[2] > x for x in set2.get(item[0:2], [])):
        print("Found match")

Answer 2

这是一个 O(n) 解决方案，其中 O(n) 附加 space 来存储 dict2 -

set1 = {("a", "b", 5), ("a", "c", 3)}
set2 = {("a", "b", 3), ("a", "b", 7)}

dict2 = {}
for item in set2:
    dict2[item[:2]] = max(dict2.get(item[:2], float('-inf')), item[2])

for item in set1:
    if dict2.get(item[0:2], float('-inf')) > item[2]:
        print("Found match")
        print("Match -", item, item[0:2] + (dict2.get(item[0:2]),))

输出-

Found match
Match - ('a', 'b', 5) ('a', 'b', 7)

我们创建一个新的 dict2 来存储 set2 中看到的两个键和最大时间戳，然后我们遍历 set1 中的项目以查看是否相同如果同一键的最大时间戳大于 set1 中的时间戳，则为键。

使用 "complex in" 条件在两个集合之间搜索匹配项

Searching for a matching items in between two sets with "complex in" condition

python

set

time-complexity