What is the best way to find ALL possible intersections of multiple sets?

Suppose I have the following 4 sets:

Set1 = {1,2,3,4,5}
Set2 = {4,5,6,7}
Set3 = {6,7,8,9,10}
Set4 = {1,8,9,15}

I want to find all possible intersections among these sets, for example:

Set1 and Set4: 1
Set1 and Set2: 4,5
Set2 and Set3: 6,7
Set3 and Set4: 8,9

What is the best way to do this in Python? Thanks!

From here:

# Python3 program for intersection() function 

set1 = {2, 4, 5, 6}  
set2 = {4, 6, 7, 8}  
set3 = {4,6,8} 

# intersection of two sets
print("set1 intersection set2 : ", set1.intersection(set2)) 

# intersection of three sets
print("set1 intersection set2 intersection set3 :", set1.intersection(set2,set3)) 

And from the docs:

intersection(*others)

set & other & ...

Return a new set with elements common to the set and all others.
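As a small illustration of the two spellings quoted from the docs, the sketch below compares the method form with the operator form. The variable names are made up for this example; the one behavioural difference it relies on is that the method accepts any iterable, while the & operator requires sets on both sides.

```python
a = {2, 4, 5, 6}
b = {4, 6, 7, 8}
c = [4, 6, 8]  # deliberately a list, not a set

# The method form accepts any number of iterables.
via_method = a.intersection(b, c)

# The operator form only works between sets, so the list is converted first.
via_operator = a & b & set(c)

print(via_method, via_operator)
```

Both expressions build a new set and leave a, b and c unchanged.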

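You need to find the combinations of 2 sets (deduced from your desired output). This can be achieved with [Python 3.Docs]: itertools.combinations(iterable, r). For each combination, the intersection of its 2 sets is computed.
To make this possible, the (input) sets are "grouped" in a list (an iterable).

Also see [Python 3.docs]: class set([iterable]).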
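To see what itertools.combinations produces before any sets are involved, here is a minimal sketch that pairs up four placeholder labels (the labels list is only for illustration). Each unordered pair appears exactly once, in the order of the input.

```python
from itertools import combinations

labels = ["Set1", "Set2", "Set3", "Set4"]

# r=2 asks for every unordered pair; 4 items give 4*3/2 = 6 pairs.
pairs = list(combinations(labels, 2))

print(len(pairs))
for pair in pairs:
    print(pair)
```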

code.py:

#!/usr/bin/env python3

import sys
import itertools


def main():
    set1 = {1, 2, 3, 4, 5}
    set2 = {4, 5, 6, 7}
    set3 = {6, 7, 8, 9, 10}
    set4 = {1, 8, 9, 15}

    sets = [set1, set2, set3, set4]

    for index_set_pair in itertools.combinations(enumerate(sets, start=1), 2):
        (index_first, set_first), (index_second, set_second) = index_set_pair
        intersection = set_first.intersection(set_second)
        if intersection:
            print("Set{:d} and Set{:d} = {:}".format(index_first, index_second, intersection))


if __name__ == "__main__":
    print("Python {:s} on {:s}\n".format(sys.version, sys.platform))
    main()
    print("\nDone.")

Note that [Python 3.Docs]: Built-in Functions - enumerate(iterable, start=0) is used only for printing purposes (Set1, Set2, ... in the output).

Output:

[cfati@CFATI-5510-0:e:\Work\Dev\Whosebug\q056551261]> "e:\Work\Dev\VEnvs\py_064_03.07.03_test0\Scripts\python.exe" code.py
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] on win32

Set1 and Set2 = {4, 5}
Set1 and Set4 = {1}
Set2 and Set3 = {6, 7}
Set3 and Set4 = {8, 9}

Done.
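If the intersections are needed as data rather than printed, the same idea can be wrapped in a function. This is only a sketch: the name pairwise_intersections and the choice of a dict keyed by set names are assumptions made for the example, not part of the code above.

```python
from itertools import combinations


def pairwise_intersections(named_sets):
    """Return {(name_a, name_b): common_elements} for every pair that overlaps."""
    result = {}
    for (name_a, set_a), (name_b, set_b) in combinations(named_sets.items(), 2):
        common = set_a & set_b
        if common:  # skip pairs with nothing in common
            result[(name_a, name_b)] = common
    return result


named_sets = {
    "Set1": {1, 2, 3, 4, 5},
    "Set2": {4, 5, 6, 7},
    "Set3": {6, 7, 8, 9, 10},
    "Set4": {1, 8, 9, 15},
}
overlaps = pairwise_intersections(named_sets)
for (name_a, name_b), common in overlaps.items():
    print(f"{name_a} and {name_b} = {common}")
```

Keying by name avoids the enumerate bookkeeping, at the cost of requiring the caller to name each set.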

If you are only looking for intersections of two sets, you can simply nest for loops:

Set1 = {1,2,3,4,5}
Set2 = {4,5,6,7}
Set3 = {6,7,8,9,10}
Set4 = {1,8,9,15}
sets = [Set1,Set2,Set3,Set4]
for i,s1 in enumerate(sets[:-1]):
    for j,s2 in enumerate(sets[i+1:]):
        print(f"Set{i+1} and Set{i+j+2} = {s1&s2}")

# Set1 and Set2 = {4, 5}
# Set1 and Set3 = set()
# Set1 and Set4 = {1}
# Set2 and Set3 = {6, 7}
# Set2 and Set4 = set()
# Set3 and Set4 = {8, 9}
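The i+j+2 in the label comes from two zero-based counters: i is offset by one, and j restarts at zero for each slice that begins at position i+1. One possible variant, sketched here with the start argument of enumerate, lets both counters hold the 1-based labels directly; collecting the results in a pairs list is an assumption made so they can be inspected.

```python
sets = [
    {1, 2, 3, 4, 5},
    {4, 5, 6, 7},
    {6, 7, 8, 9, 10},
    {1, 8, 9, 15},
]

pairs = []
# i is the 1-based label of the first set in each pair.
for i, s1 in enumerate(sets[:-1], start=1):
    # sets[i:] begins at the set labelled i+1, so numbering starts there.
    for j, s2 in enumerate(sets[i:], start=i + 1):
        pairs.append((i, j, s1 & s2))

for i, j, common in pairs:
    print(f"Set{i} and Set{j} = {common}")
```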

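If you are looking for intersections of any number of these sets, you can use combinations() from itertools to generate the power set of indexes and perform the intersection for each combination: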

from itertools import combinations
for comboSize in range(2,len(sets)):
    for combo in combinations(range(len(sets)),comboSize):
        intersection = sets[combo[0]]
        for i in combo[1:]: intersection = intersection & sets[i]
        print(" and ".join(f"Set{i+1}" for i in combo),"=",intersection)

Set1 and Set2 = {4, 5}
Set1 and Set3 = set()
Set1 and Set4 = {1}
Set2 and Set3 = {6, 7}
Set2 and Set4 = set()
Set3 and Set4 = {8, 9}
Set1 and Set2 and Set3 = set()
Set1 and Set2 and Set4 = set()
Set1 and Set3 and Set4 = set()
Set2 and Set3 and Set4 = set()
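  • The "power set" means getting all possible combinations of every size from a set of values. The itertools documentation has a recipe for it.
  • In this case, we are only interested in combinations of size 2, 3, ..., n-1, hence the loop on comboSize in range(2,len(sets)).
  • For each of these sizes, we use the combinations function from itertools to get combinations of indexes into the sets list. For example, with comboSize=3 and 4 items in sets, combinations yields: (0, 1, 2) (0, 1, 3) (0, 2, 3) (1, 2, 3)
  • The intersection is computed with the & operator (set intersection), starting from the set at the first index (combo[0]) and intersecting it with the sets at the remaining indexes (combo[1:]).
  • The print call joins the set identifiers (f"Set{i+1}") with the " and " string and prints the resulting intersection on the same line.
  • f"Set{i+1}" is a format string that replaces {i+1} with the index of the set within the combination, plus 1.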
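The steps in the list above can also be packaged as a reusable function. The sketch below is one possible arrangement, not the answer's own code: the name all_intersections and its min_size and max_size parameters are hypothetical. It also differs in two ways: it goes up to size n by default, so the intersection of all sets is included, and it keeps only non-empty results. The inner loop over combo[1:] is replaced by a single set.intersection(*chosen) call.

```python
from itertools import combinations


def all_intersections(sets, min_size=2, max_size=None):
    """Map labels like "Set1 and Set3" to the non-empty intersection of those sets.

    min_size must be at least 1, because set.intersection needs one set to start from.
    """
    if max_size is None:
        max_size = len(sets)  # assumption: include the n-way intersection too
    results = {}
    for combo_size in range(min_size, max_size + 1):
        for combo in combinations(range(len(sets)), combo_size):
            chosen = [sets[i] for i in combo]
            common = set.intersection(*chosen)
            if common:
                label = " and ".join(f"Set{i + 1}" for i in combo)
                results[label] = common
    return results


question_sets = [{1, 2, 3, 4, 5}, {4, 5, 6, 7}, {6, 7, 8, 9, 10}, {1, 8, 9, 15}]
found = all_intersections(question_sets)

# A second, made-up input where three sets share an element.
overlapping = all_intersections([{1, 2, 3}, {2, 3, 4}, {3, 4, 5}])

for label, common in found.items():
    print(label, "=", common)
```

Passing max_size=len(sets) - 1 restricts the sizes to the 2 to n-1 range used above.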