How should I get a list of duplicate sublists in a list?
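I'm trying to write functions that give me a list of the sublists that are duplicated in a list. The functions work for some lists of lists but not for others, and I'm not sure why.
What would be a reliable, efficient way of getting the indices of duplicate sublists and then building a list of them?
The following minimal working example illustrates the functions. The duplicates of list a are found, but the duplicates of list b are not found correctly.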
def indices_of_list_element_duplicates(x):
    seen = set()
    for index, element in enumerate(x):
        if isinstance(element, list):
            element = tuple(element)
        if element not in seen:
            seen.add(element)
        else:
            yield index

def list_element_duplicates(x):
    indices = list(indices_of_list_element_duplicates(x))
    return [x[index] for index in indices]

a = [[1, 2], [1, 2], [2, 2], [3, 2], [4, 2], [5, 2], [5, 2]]
print(list_element_duplicates(a))
print("--------------------------------------------------------------------------------")
b = [[10], [15], [20], [10, 10], [10, 15], [10, 20], [15, 10], [15, 15], [15, 20], [20, 10], [20, 15], [20, 20], [10, 10, 10], [10, 10, 15], [10, 10, 20], [10, 15, 10], [10, 15, 15], [10, 15, 20], [10, 20, 10], [10, 20, 15], [10, 20, 20], [15, 10, 10], [15, 10, 15], [15, 10, 20], [15, 15, 10], [15, 15, 15], [15, 15, 20], [15, 20, 10], [15, 20, 15], [15, 20, 20], [20, 10, 10], [20, 10, 15], [20, 10, 20], [20, 15, 10], [20, 15, 15], [20, 15, 20], [20, 20, 10], [20, 20, 15], [20, 20, 20], [10], [15], [20], [10, 10], [10, 15], [10, 20], [15, 10], [15, 15], [15, 20], [20, 10], [20, 15], [20, 20], [10, 10, 10], [10, 10, 15], [10, 10, 20], [10, 15, 10], [10, 15, 15], [10, 15, 20], [10, 20, 10], [10, 20, 15], [10, 20, 20], [15, 10, 10], [15, 10, 15], [15, 10, 20], [15, 15, 10], [15, 15, 15], [15, 15, 20], [15, 20, 10], [15, 20, 15], [15, 20, 20], [20, 10, 10], [20, 10, 15], [20, 10, 20], [20, 15, 10], [20, 15, 15], [20, 15, 20], [20, 20, 10], [20, 20, 15], [20, 20, 20], [10], [15], [20], [10, 10], [10, 15], [10, 20], [15, 10], [15, 15], [15, 20], [20, 10], [20, 15], [20, 20], [10, 10, 10], [10, 10, 15], [10, 10, 20], [10, 15, 10], [10, 15, 15], [10, 15, 20], [10, 20, 10], [10, 20, 15], [10, 20, 20], [15, 10, 10], [15, 10, 15], [15, 10, 20], [15, 15, 10], [15, 15, 15], [15, 15, 20], [15, 20, 10], [15, 20, 15], [15, 20, 20], [20, 10, 10], [20, 10, 15], [20, 10, 20], [20, 15, 10], [20, 15, 15], [20, 15, 20], [20, 20, 10], [20, 20, 15], [20, 20, 20], [10], [15], [20], [10, 10], [10, 15], [10, 20], [15, 10], [15, 15], [15, 20], [20, 10], [20, 15], [20, 20], [10, 10, 10], [10, 10, 15], [10, 10, 20], [10, 15, 10], [10, 15, 15], [10, 15, 20], [10, 20, 10], [10, 20, 15], [10, 20, 20], [15, 10, 10], [15, 10, 15], [15, 10, 20], [15, 15, 10], [15, 15, 15], [15, 15, 20], [15, 20, 10], [15, 20, 15], [15, 20, 20], [20, 10, 10], [20, 10, 15], [20, 10, 20], [20, 15, 10], [20, 15, 15], [20, 15, 20], [20, 20, 10], [20, 20, 15], [20, 20, 20]]
print(list_element_duplicates(b))
Python lists have a handy built-in method called count. Using it, you can do:
a = [[1, 2], [1, 2], [2, 2], [3, 2], [4, 2], [5, 2], [5, 2]]
dups = list()
for e in a:
    if a.count(e) > 1:
        dups.append(e)
This gives you a list called dups containing [[1, 2], [1, 2], [5, 2], [5, 2]].
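As a minimal sketch, the same idea can be wrapped in a function (the name `duplicates_by_count` is made up for illustration). Keep in mind that `list.count` scans the whole list on every call, so this approach is quadratic in the length of the list: fine for short lists, slow for long ones.

```python
def duplicates_by_count(items):
    """Return every element of items that occurs more than once (hypothetical helper)."""
    # list.count compares with ==, so it also works on unhashable sublists.
    return [e for e in items if items.count(e) > 1]


a = [[1, 2], [1, 2], [2, 2], [3, 2], [4, 2], [5, 2], [5, 2]]
print(duplicates_by_count(a))  # [[1, 2], [1, 2], [5, 2], [5, 2]]
```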
You could use a Counter dict, mapping the sublists to tuples to get the counts, and keep only the sublists whose count is greater than 1:
from collections import Counter
a = [[1, 2], [1, 2], [2, 2], [3, 2], [4, 2], [5, 2], [5, 2]]
cn = Counter(map(tuple,a))
print([sub for sub in a if cn[tuple(sub)] > 1])
To handle mixed types and return each duplicate only once:
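from collections import Counter
from collections.abc import Hashable  # needed for the isinstance checks below

a = [[1, 2], [1, 2], [2, 2], [3, 2], [4, 2], [5, 2], [5, 2], "foo", 123, 123]

def counts(x):
    # Yield a hashable key for every element: lists become tuples.
    for ele in x:
        if isinstance(ele, Hashable):
            yield ele
        else:
            yield tuple(ele)

def unique_dupes(x):
    cnts = Counter(counts(x))
    for ele in x:
        t = ele
        if not isinstance(ele, Hashable):
            t = tuple(ele)
        if cnts[t] > 1:
            yield ele
            # Drop the key so the same duplicate is not yielded again.
            del cnts[t]

print(list(unique_dupes(a)))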
Output:
[[1, 2], [5, 2], 123]
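Since the question also asks for the indices of the duplicates, here is a rough sketch that groups indices by value. The name `duplicate_indices` is my own, and the sketch assumes every element is either hashable or a flat list of hashable items.

```python
from collections import defaultdict


def duplicate_indices(items):
    """Map each repeated element (lists as tuples) to all indices where it occurs."""
    positions = defaultdict(list)
    for index, element in enumerate(items):
        # Lists are unhashable, so use a tuple as the dictionary key.
        key = tuple(element) if isinstance(element, list) else element
        positions[key].append(index)
    # Keep only the keys that were seen at more than one index.
    return {key: idxs for key, idxs in positions.items() if len(idxs) > 1}


a = [[1, 2], [1, 2], [2, 2], [3, 2], [4, 2], [5, 2], [5, 2]]
print(duplicate_indices(a))  # {(1, 2): [0, 1], (5, 2): [5, 6]}
```

One pass over the list is enough, and the sublists themselves can be recovered with `a[idxs[0]]` for each entry.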
The problem must come from these lines:
if isinstance(element, list):
    element = tuple(element)
if element not in seen:
    seen.add(element)
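So if seen already contains, say, [10, 15], and you then check whether [15, 10] is in seen, the check returns False.
If you consider [x, y] to be the same as [y, x], the fix is to sort each element before checking it, like this: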
if isinstance(element, list):
    element = tuple(sorted(element))
if element not in seen:
    seen.add(element)
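Whether to sort depends on what should count as a duplicate. Here is a small sketch of the difference, using a hypothetical `order_insensitive` flag that is not part of the original code:

```python
def indices_of_duplicates(x, order_insensitive=False):
    """Yield the index of every element that repeats an earlier one."""
    seen = set()
    for index, element in enumerate(x):
        if isinstance(element, list):
            # Sorting first makes [10, 15] and [15, 10] share one key.
            element = tuple(sorted(element)) if order_insensitive else tuple(element)
        if element in seen:
            yield index
        else:
            seen.add(element)


b = [[10, 15], [15, 10], [10, 15]]
print(list(indices_of_duplicates(b)))                          # [2]
print(list(indices_of_duplicates(b, order_insensitive=True)))  # [1, 2]
```

Without sorting, only exact repeats are reported; with sorting, permutations of the same items are reported too.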
This is easy with a list comprehension:
list_a = [[1, 2], [1, 2], [2, 2], [3, 2], [4, 2], [5, 2], [5, 2]]
unique_list=[]
duplicate_list=[]
sorted_list=[sorted(item) for item in list_a]
final_list=[unique_list.append(item) if item not in unique_list else duplicate_list.append(item) for item in sorted_list]
print(unique_list)
print(duplicate_list)
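The comprehension above is used only for its side effects, and final_list ends up as a list of None values. A plain loop does the same work more readably. This sketch also tracks seen items in a set of tuples, so each membership test is constant time on average. Like the original, it sorts each sublist, so it treats [1, 2] and [2, 1] as equal. The function name is made up for illustration.

```python
def split_unique_and_duplicates(list_of_lists):
    """Return (unique, duplicates), comparing sublists regardless of item order."""
    seen = set()
    unique, duplicates = [], []
    for item in list_of_lists:
        ordered = sorted(item)
        key = tuple(ordered)  # hashable stand-in for the sorted sublist
        if key in seen:
            duplicates.append(ordered)
        else:
            seen.add(key)
            unique.append(ordered)
    return unique, duplicates


list_a = [[1, 2], [1, 2], [2, 2], [3, 2], [4, 2], [5, 2], [5, 2]]
unique_list, duplicate_list = split_unique_and_duplicates(list_a)
print(unique_list)     # [[1, 2], [2, 2], [2, 3], [2, 4], [2, 5]]
print(duplicate_list)  # [[1, 2], [2, 5]]
```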