How can I find an intersection among multiple lists?

I have multiple lists and I want to find the intersection among them. I tried the following code:

my_lists = [['Finish', 'Purpose', 'Form', 'Series', 'Tiles Type', 'Finishing'], ['Color', 'Thickness', 'Usage/Application', 'Brand', 'Marble Type', 'Material'], ['Color', 'Brand', 'Finishing', 'Origin', 'Marble Type', 'Thickness'], ['Thickness', 'Form', 'Size', 'Series', 'Usage/Application', 'Finishing'], ['Thickness', 'Material Grade', 'Size', 'Usage/Application', 'Material'], ['Usage/Application', 'Form', 'Finishing', 'Brand', 'Material', 'Shape'], ['Application Area', 'Form', 'Finishing', 'Brand', 'Color', 'Coverage Area'], ['Usage/Application', 'Marble Type', 'Thickness', 'Brand', 'Form'], ['Unit Size (mm X mm)', 'Marble Type', 'Thickness', 'Finishing', 'Usage', 'Brand'], ['Marble Type', 'Unit Size (mm X mm)', 'Usage', 'Thickness', 'Color'], ['color'], ['Thickness', 'Size', 'Usage/Application', 'Series', 'Finish', 'Marble Type'], ['Thickness', 'Usage/Application', 'Brand', 'Color', 'Marble Type', 'Unit Size (mm X mm)'], ['Color', 'Marble Type', 'Usage'], ['Thickness', 'Size', 'Material', 'Finish', 'Packaging Size', 'Packaging Type'], ['Color', 'Material', 'Thickness', 'Usage/Application', 'Back Lit', 'Brand'], ['Material', 'Pattern', 'Shape'], ['Form', 'Application Area', 'Material', 'Thickness', 'Colour', 'Finishing'], ['Color', 'Usage/Application', 'Brand', 'Series'], ['Color', 'Material', 'Thickness', 'Usage/Application', 'Brand', 'Surface Finish'], ['Brand', 'Color', 'Usage/Application', 'Thickness', 'Size', 'Finish'], ['Form', 'Material', 'Usage', 'Marble Type', 'Thickness', 'Finishing'], ['Form', 'Color', 'Marble Type', 'Unit Size', 'Features', 'Coverage Area'], ['Usage', 'Form'], ['Finish', 'Application Area', 'Purpose', 'Thickness', 'Pattern'], ['Usage/Application', 'Finishing', 'Material', 'Brand', 'Size', 'Category Type'], ['Usage/Application', 'Size', 'Color', 'Marble Type', 'Features', 'Finishing'], ['Marble Type', 'Surface Finishing', 'Stone Form', 'Usage'], ['Brand', 'Material', 'Finish', 'Thickness', 'Size']]
print(set.intersection(*map(set, my_lists)))

But I get an empty set:

set()

What I actually want is to find the elements that are common across all the lists.

I think this will help:

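from functools import reduce
import numpy
# Fold intersect1d over the lists; the result holds only values present in every list.
print(reduce(numpy.intersect1d, my_lists))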

Source: https://numpy.org/doc/stable/reference/generated/numpy.intersect1d.html
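As a rough illustration of what the `reduce` call above does, here is a small sketch on made-up data (`toy_lists` and `common` are hypothetical names, not from the question). `numpy.intersect1d` returns the sorted, unique values shared by its two inputs, so folding it over the lists leaves only the values present in all of them.

```python
from functools import reduce

import numpy as np

# Hypothetical toy data: 'Brand' and 'Size' appear in every list.
toy_lists = [['Size', 'Brand', 'Color'], ['Brand', 'Size'], ['Size', 'Form', 'Brand']]

# intersect1d works on two inputs at a time; reduce chains it across all lists.
common = reduce(np.intersect1d, toy_lists)
print(common.tolist())  # ['Brand', 'Size']
```

The result is a NumPy array rather than a set, and it comes back sorted. On the data in the question it would be empty, just like the `set.intersection` result.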

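There is no element common to all of the lists in your example. You can see that the first and second lists are completely disjoint, so the empty set is the correct answer. This operation only finds strings that appear in EACH list.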
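To make that concrete, here is a minimal sketch on made-up toy lists (`toy_lists`, `common`, and `common_after` are illustrative names, not from the question). A string survives `set.intersection` only if it is in every list, so one list that lacks it is enough to empty the result.

```python
# Hypothetical toy data: 'Brand' appears in every list, 'Color' in only two.
toy_lists = [['Brand', 'Color'], ['Brand', 'Size'], ['Color', 'Brand']]
common = set.intersection(*map(set, toy_lists))
print(common)  # {'Brand'}

# One list that shares nothing with the others empties the intersection.
# String comparison is case-sensitive, so 'color' does not match 'Color'.
toy_lists.append(['color'])
common_after = set.intersection(*map(set, toy_lists))
print(common_after)  # set()
```

The question's data contains a list with the single entry `'color'` (lowercase), so no string could be common to every list there unless it were exactly `'color'`.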

Edit

If your goal is to find the strings that are ever repeated, I would do something like this:

import numpy as np
my_lists = [['Finish', 'Purpose', 'Form', 'Series', 'Tiles Type', 'Finishing'], ['Color', 'Thickness', 'Usage/Application', 'Brand', 'Marble Type', 'Material'], ['Color', 'Brand', 'Finishing', 'Origin', 'Marble Type', 'Thickness'], ['Thickness', 'Form', 'Size', 'Series', 'Usage/Application', 'Finishing'], ['Thickness', 'Material Grade', 'Size', 'Usage/Application', 'Material'], ['Usage/Application', 'Form', 'Finishing', 'Brand', 'Material', 'Shape'], ['Application Area', 'Form', 'Finishing', 'Brand', 'Color', 'Coverage Area'], ['Usage/Application', 'Marble Type', 'Thickness', 'Brand', 'Form'], ['Unit Size (mm X mm)', 'Marble Type', 'Thickness', 'Finishing', 'Usage', 'Brand'], ['Marble Type', 'Unit Size (mm X mm)', 'Usage', 'Thickness', 'Color'], ['color'], ['Thickness', 'Size', 'Usage/Application', 'Series', 'Finish', 'Marble Type'], ['Thickness', 'Usage/Application', 'Brand', 'Color', 'Marble Type', 'Unit Size (mm X mm)'], ['Color', 'Marble Type', 'Usage'], ['Thickness', 'Size', 'Material', 'Finish', 'Packaging Size', 'Packaging Type'], ['Color', 'Material', 'Thickness', 'Usage/Application', 'Back Lit', 'Brand'], ['Material', 'Pattern', 'Shape'], ['Form', 'Application Area', 'Material', 'Thickness', 'Colour', 'Finishing'], ['Color', 'Usage/Application', 'Brand', 'Series'], ['Color', 'Material', 'Thickness', 'Usage/Application', 'Brand', 'Surface Finish'], ['Brand', 'Color', 'Usage/Application', 'Thickness', 'Size', 'Finish'], ['Form', 'Material', 'Usage', 'Marble Type', 'Thickness', 'Finishing'], ['Form', 'Color', 'Marble Type', 'Unit Size', 'Features', 'Coverage Area'], ['Usage', 'Form'], ['Finish', 'Application Area', 'Purpose', 'Thickness', 'Pattern'], ['Usage/Application', 'Finishing', 'Material', 'Brand', 'Size', 'Category Type'], ['Usage/Application', 'Size', 'Color', 'Marble Type', 'Features', 'Finishing'], ['Marble Type', 'Surface Finishing', 'Stone Form', 'Usage'], ['Brand', 'Material', 'Finish', 'Thickness', 'Size']]
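# Flatten the list of lists into one list of strings.
big_list = [x for a_list in my_lists for x in a_list]
# Count how many times each distinct string appears.
unique_strings, number_of_appearances = np.unique(big_list, return_counts=True)
# Order the indices from most to least frequent.
index = np.flip(np.argsort(number_of_appearances))
print(unique_strings[index], number_of_appearances[index])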

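This flattens your list of lists, finds the unique strings, and sorts them by the number of times they appear (most to least). The first string is the most frequently found element, and any string with a count above 1 is repeated across multiple lists (provided no single list contains the same string twice).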
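If NumPy is not needed, a similar count can be sketched with the standard library's `collections.Counter`. This is only one possible variant, shown on hypothetical toy data (`toy_lists`, `list_counts`, `repeated`, and `in_all` are illustrative names). Converting each list to a set first means a string is counted at most once per list, so the count is the number of lists it appears in.

```python
from collections import Counter

# Hypothetical toy data standing in for my_lists.
toy_lists = [['Brand', 'Color', 'Size'], ['Brand', 'Size'], ['Brand', 'Form']]

# set(a_list) removes duplicates inside a list, so each list contributes at most 1.
list_counts = Counter(s for a_list in toy_lists for s in set(a_list))

# Strings found in at least two lists, most frequent first.
repeated = [(s, n) for s, n in list_counts.most_common() if n > 1]
print(repeated)  # [('Brand', 3), ('Size', 2)]

# Strings found in every list: the same result set.intersection would give.
in_all = {s for s, n in list_counts.items() if n == len(toy_lists)}
print(in_all)  # {'Brand'}
```

Relaxing the last condition (for example `n >= 0.8 * len(toy_lists)`) would give the strings present in most lists, which may be closer to what is wanted when a strict intersection is empty.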