Efficient and not memory consuming way to find all possible pairs in list

Question

I have a dictionary called lemma_all_context_dict, and it has approximately 8000 keys. I need a list of all possible pairs of these keys.

I used:

pairs_of_words_list = list(itertools.combinations(lemma_all_context_dict.keys(), 2))

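However, when I run this line I get a MemoryError. I have 8GB of RAM, but perhaps I get this error anyway because I have some very large dictionaries in this code.

So I tried a different approach: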

pairs_of_words_list = []
for p_one in range(len(lemma_all_context_dict.keys())):
        for p_two in range(p_one+1,len(lemma_all_context_dict.keys())):
                pairs_of_words_list.append([lemma_all_context_dict.keys()[p_one],lemma_all_context_dict.keys()[p_two]])

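But this piece of code takes around 20 minutes to run... Does anyone know a more efficient way to solve the problem? Thanks

**I don't think this question is a duplicate, because what I'm asking (and I don't think it has been asked before) is how to implement this without my computer crashing :-P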

Answer 1

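Don't build a list, since that is why you get the memory error (you even create two lists, because in Python 2 .keys() returns a list as well). You can iterate over the iterator directly, which is what iterators are for: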

import itertools
# Iterating over a dict yields its keys; pairs are produced one at a time.
for a, b in itertools.combinations(lemma_all_context_dict, 2):
    print(a, b)
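
For scale, 8000 keys give 8000 * 7999 / 2 = 31,996,000 pairs, which is why holding them all in one list is so expensive. The following is a minimal sketch in Python 3 syntax of consuming the pairs lazily. The dictionary contents are a hypothetical stand-in for the real lemma_all_context_dict, and counting is only a placeholder for whatever per-pair work is actually needed.

```python
import itertools

# Hypothetical stand-in for lemma_all_context_dict (8000 keys, dummy values).
lemma_all_context_dict = {"lemma_%d" % i: None for i in range(8000)}

n = len(lemma_all_context_dict)
# Number of unordered pairs is n * (n - 1) / 2.
expected_pairs = n * (n - 1) // 2

# combinations() returns a lazy iterator, so the full set of pairs
# is never held in memory at once.
pair_iter = itertools.combinations(lemma_all_context_dict, 2)

# Consume the iterator without storing it; counting stands in for real work.
pair_count = sum(1 for _ in pair_iter)

# To inspect a few pairs, slice a fresh iterator instead of building a list.
first_three = list(itertools.islice(
    itertools.combinations(lemma_all_context_dict, 2), 3))
```

If the pairs really must be kept, storing only those that pass some filter inside the loop is usually much cheaper than storing all of them.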

python

memory