Find unique lists inside another list in an efficient way

Question

solution = [[1,0,0],[0,1,0], [1,0,0], [1,0,0]]

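I have the nested list above, which contains other lists. How can I get the unique lists in solution?

output = [[1,0,0],[0,1,0]]

Note: each inner list has the same size.

Things I have tried:

Take each list and compare it with all the others to see whether it is a duplicate? But that is very slow.

How can I check for a duplicate before inserting a list, so that duplicates are never inserted?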

Answer 1

pandas' DataFrame.duplicated may help.

import pandas as pd
df=pd.DataFrame([[1,0,0],[0,1,0], [1,0,0], [1,0,0]])
d =df[~df.duplicated()].values.tolist()

Output

[[1, 0, 0], [0, 1, 0]]

Alternatively, since you tagged the question multidimensional-array, you can use a numpy approach.

import numpy as np
def unique_rows(a):
    a = np.ascontiguousarray(a)
    unique_a = np.unique(a.view([('', a.dtype)]*a.shape[1]))
    return unique_a.view(a.dtype).reshape((unique_a.shape[0], a.shape[1]))
arr=np.array([[1,0,0],[0,1,0], [1,0,0], [1,0,0]])
output=unique_rows(arr).tolist()

Based on the suggestion in this OP.

Answer 2

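Because lists are mutable, they are not hashable, so you cannot quickly test whether one has already been seen. You can, however, convert each list to a tuple and store that tuple-ized view in a set.

Tuples are heterogeneous, immutable containers, unlike lists, which are mutable and idiomatically homogeneous.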

from typing import List, Any

def de_dupe(lst: List[List[Any]]) -> List[List[Any]]:
    seen = set()
    output = []
    for element in lst:
        tup = tuple(element)
        if tup in seen:
            continue  # we've already added this one
        seen.add(tup)
        output.append(element)
    return output

solution = [[1,0,0],[0,1,0], [1,0,0], [1,0,0]]
assert de_dupe(solution) == [[1, 0, 0], [0, 1, 0]]
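The same set-of-tuples idea can also cover the second part of the question, checking for a duplicate before inserting. A minimal sketch, assuming a hypothetical helper class `UniqueRows` (not part of any library) that keeps a set of seen tuples alongside the list:

```python
from typing import Any, List, Set, Tuple


class UniqueRows:
    """Hypothetical helper: a list of rows that rejects duplicates on insert."""

    def __init__(self) -> None:
        self.rows: List[List[Any]] = []
        self._seen: Set[Tuple[Any, ...]] = set()

    def add(self, row: List[Any]) -> bool:
        """Append row unless an equal row is already stored; return True if added."""
        key = tuple(row)  # lists are unhashable, so a tuple serves as the set key
        if key in self._seen:
            return False
        self._seen.add(key)
        self.rows.append(row)
        return True


unique = UniqueRows()
for row in [[1, 0, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]]:
    unique.add(row)

print(unique.rows)  # [[1, 0, 0], [0, 1, 0]]
```

Each call to `add` does one set lookup, which takes constant time on average, so building the list is roughly linear in the number of rows rather than quadratic as with pairwise comparison. This assumes the items inside each row are themselves hashable.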

Answer 3

If you don't care about order, you can use a set:

solution = [[1,0,0],[0,1,0],[1,0,0],[1,0,0]]

output = set(map(tuple, solution))
print(output) # {(1, 0, 0), (0, 1, 0)}
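If order does matter, a small variation keeps the first occurrence of each list. This is a sketch that relies on `dict` preserving insertion order (guaranteed from Python 3.7); the name `ordered_unique` is only illustrative:

```python
solution = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]]

# dict keys are unique and keep the order in which they were first inserted
ordered_unique = [list(t) for t in dict.fromkeys(map(tuple, solution))]
print(ordered_unique)  # [[1, 0, 0], [0, 1, 0]]
```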

Answer 4

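Lists are not hashable and are therefore inefficient to deduplicate, but tuples are hashable. So one approach is to convert your lists to tuples and deduplicate those.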

>>> solution_tuples = [(1,0,0), (0,1,0), (1,0,0), (1,0,0)]
>>> set(solution_tuples)
{(1, 0, 0), (0, 1, 0)}
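To make the hashability point concrete, and to start from the nested list in the question rather than hand-written tuples, here is a small sketch (the variable names are illustrative):

```python
solution = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [1, 0, 0]]

# A list cannot be hashed, so it cannot be a member of a set.
try:
    hash(solution[0])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# A tuple of hashable items can be hashed, so convert each inner list first.
solution_tuples = [tuple(row) for row in solution]
unique_tuples = set(solution_tuples)

# Convert back to lists if needed; sorting gives a deterministic order,
# because a set does not guarantee any particular order.
unique_lists = sorted(list(t) for t in unique_tuples)
print(list_is_hashable, unique_lists)
```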

Answer 5

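Try this solution:

x=[[1,0,0],[0,1,0],[1,0,0],[1,0,0]]

Import numpy and convert the nested list to a numpy array:

import numpy as np

a1=np.array(x)

Find the unique rows:

a2 = np.unique(a1, axis=0)

Convert it back to a nested list:

a2.tolist()  # [[0, 1, 0], [1, 0, 0]]; np.unique returns the rows sorted

Hope this helps.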

Tags: python, list, nested-lists, multidimensional-array