Possible to filter duplicate values during dictionary comprehension?

Question

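I am creating a dictionary of random coordinates keyed by index, and the coordinates must be unique.

I know I could go back and filter out duplicates after the dictionary has been built, but is it possible to guarantee there are no duplicates while the dictionary is being created by a comprehension?

In pseudocode, I would like to do something like this: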

 d = {k: (randint,randint) for k in range if (randint,randint) not in d}

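But this references the dictionary d before it has been assigned.

To be clear, I am creating the dictionary through a comprehension, not modifying an existing dictionary.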

Answer 1

As mentioned in the comments, you should use a loop.

If you want to be fancy, there is the old trick:

>>> from random import randint, seed
>>> seen = set()
>>> {k: (x, y) for k, x, y in ((k, randint(0,5),randint(0,5)) for k in range(50)) if (x, y) not in seen and not seen.add((x, y))}
{0: (3, 4), 1: (3, 3), 2: (4, 4), 3: (1, 1), 4: (4, 3), 5: (5, 4), 6: (1, 0), 7: (3, 2), 9: (4, 5), 10: (5, 0), 12: (3, 5), 14: (5, 1), 15: (4, 0), 17: (0, 0), 22: (1, 4), 23: (1, 5), 24: (2, 3), 25: (0, 5), 26: (0, 3), 27: (5, 2), 30: (2, 2), 32: (2, 0), 33: (0, 4), 35: (0, 2), 36: (3, 0), 38: (0, 1), 41: (5, 3)}

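The generator ((k, randint(0,5),randint(0,5)) for k in range(50)) produces 50 indexed coordinates. Each tuple k, x, y is then tested: if (x, y) not in seen holds, the second part of the and is evaluated. seen.add((x, y)) returns None, so not seen.add((x, y)) is always True, but it has the side effect of adding the element to seen.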
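The short-circuit trick is easier to follow on a fixed input. Here is a minimal sketch, where the list values is made up purely for illustration:

```python
seen = set()
values = [3, 1, 3, 2, 1]  # hypothetical input containing duplicates

# "v not in seen" rejects repeats; "not seen.add(v)" is always True
# (set.add returns None) and records v as a side effect.
unique = [v for v in values if v not in seen and not seen.add(v)]

print(unique)  # [3, 1, 2]
```

Because and short-circuits, seen.add(v) only runs for values that passed the membership test.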

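The drawback is that the size of the dict depends on the number of duplicates, and the keys are no longer contiguous (8 and 11 are missing in the output above). To choose the number of elements, you have to: 1. create an infinite generator; 2. take the first N unique elements.

You can create an infinite generator with the iter function and a fake sentinel ((-1, -1) is never produced):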

>>> it = iter(lambda: (randint(0,5),randint(0,5)), (-1, -1))

Now, using the standard itertools module, you can take the first 10 elements without duplicates:

>>> import itertools
>>> seen = set()
>>> dict(enumerate(itertools.islice(((x,y) for x, y in it if (x, y) not in seen and not seen.add((x, y))), 10)))
{0: (5, 0), 1: (0, 0), 2: (3, 3), 3: (1, 5), 4: (4, 1), 5: (3, 4), 6: (1, 3), 7: (5, 3), 8: (0, 3), 9: (3, 1)}
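The same pipeline may read more easily if the duplicate filter is pulled out into a small generator function. This is only a sketch: the helper name unique and the variable points are my own choices, not part of the code above.

```python
import itertools
from random import randint


def unique(iterable):
    """Yield each item of iterable the first time it appears."""
    seen = set()
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item


# Infinite stream of random points; the sentinel (-1, -1) can never occur.
points = iter(lambda: (randint(0, 5), randint(0, 5)), (-1, -1))

# Take the first 10 distinct points and number them 0..9.
d = dict(enumerate(itertools.islice(unique(points), 10)))
```

Note that a 6 x 6 grid holds only 36 distinct points, so asking islice for more than 36 elements here would never terminate.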

But you should use a loop...
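For comparison, a plain loop could look something like the following. This is a sketch under my own assumptions: the function name unique_coordinates and its parameters n, low and high are hypothetical, and the guard against an impossible request is an addition of mine.

```python
from random import randint


def unique_coordinates(n, low=0, high=5):
    """Return {index: (x, y)} holding n distinct random coordinates."""
    side = high - low + 1
    if n > side * side:
        # Without this check the loop below could never finish.
        raise ValueError("grid does not contain that many distinct points")

    seen = set()
    coords = {}
    while len(coords) < n:
        point = (randint(low, high), randint(low, high))
        if point not in seen:
            seen.add(point)
            coords[len(coords)] = point  # keys stay contiguous: 0, 1, 2, ...
    return coords


d = unique_coordinates(10)
```

The loop keeps drawing until it has exactly n points, so both the size of the dict and its keys are predictable.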


python

dictionary

duplicates

dictionary-comprehension