Mapping values from a dictionary's list to a string in Python

Question

I am working with a sentence structure like this:

sentence = "PERSON is ADJECTIVE"
dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"]}

I now need every possible combination that can be formed from the dictionary for this sentence, for example:

Alice is cute
Alice is intelligent
Bob is cute
Bob is intelligent
Carol is cute
Carol is intelligent

The use case above is fairly simple, and the following code handles it:

dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"]}

for i in dictionary["PERSON"]:
    for j in dictionary["ADJECTIVE"]:
        print(f"{i} is {j}")

But can this be extended to longer sentences as well?

Example:

sentence = "PERSON is ADJECTIVE and is from COUNTRY" 
dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"], "COUNTRY": ["USA", "Japan", "China", "India"]}

This should again produce every possible combination, for example:

Alice is cute and is from USA
Alice is intelligent and is from USA
...
Carol is intelligent and is from India

I tried using https://www.pythonpool.com/python-permutations/, but the sentences came out scrambled. How can we keep some of the words fixed? In this example, "and is from" is fixed.

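Basically, if any key in the dictionary equals a word in the string, that word should be replaced with the values from the corresponding list in the dictionary.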

Any ideas would be very helpful.

Answer 1

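You can first replace the dictionary keys in sentence with {}, so that you can easily format the string in a loop. Then you can use itertools.product to create the Cartesian product of dictionary.values(), and simply loop over it to build the sentences you want.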

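from itertools import product
sentence = "PERSON is ADJECTIVE and is from COUNTRY"
dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"], "COUNTRY": ["USA", "Japan", "China", "India"]}

# Replace each word that is a dictionary key with a positional {} placeholder
sentence = ' '.join([('{}' if w in dictionary else w) for w in sentence.split()])
# Lazily format the template with every tuple of the Cartesian product
mapped_sentences_generator = (sentence.format(*tple) for tple in product(*dictionary.values()))
for s in mapped_sentences_generator:
    print(s)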

Output:

Alice is cute and is from USA
Alice is cute and is from Japan
Alice is cute and is from China
Alice is cute and is from India
Alice is intelligent and is from USA
Alice is intelligent and is from Japan
Alice is intelligent and is from China
Alice is intelligent and is from India
Bob is cute and is from USA
Bob is cute and is from Japan
Bob is cute and is from China
Bob is cute and is from India
Bob is intelligent and is from USA
Bob is intelligent and is from Japan
Bob is intelligent and is from China
Bob is intelligent and is from India
Carol is cute and is from USA
Carol is cute and is from Japan
Carol is cute and is from China
Carol is cute and is from India
Carol is intelligent and is from USA
Carol is intelligent and is from Japan
Carol is intelligent and is from China
Carol is intelligent and is from India

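Note that this relies on dictionaries preserving insertion order, which is guaranteed from Python 3.7 on (CPython 3.6 already behaves this way as an implementation detail). On older Pythons, collections.OrderedDict must be used instead of dict. It also assumes that the keys appear in the sentence in the same order as in the dictionary.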
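
The snippet above fills the {} placeholders positionally, so it depends on the keys being listed in the dictionary in the same order as they occur in the sentence. A minimal sketch that takes the order from the sentence itself instead, assuming keys appear as whitespace-separated words and the sentence contains no literal braces (the function name expand is hypothetical, not part of the original answer):

```python
from itertools import product

def expand(sentence, dictionary):
    """Yield every sentence obtained by substituting dictionary values for key words."""
    words = sentence.split()
    # Keys in the order they occur in the sentence, not in the dictionary
    slots = [w for w in words if w in dictionary]
    template = " ".join("{}" if w in dictionary else w for w in words)
    for combo in product(*(dictionary[k] for k in slots)):
        yield template.format(*combo)

# Dictionary order deliberately differs from the sentence order
dictionary = {"COUNTRY": ["USA", "Japan"], "PERSON": ["Alice", "Bob"]}
for s in expand("PERSON is from COUNTRY", dictionary):
    print(s)
```

Because only keys found in the sentence are passed to product, keys that the sentence never uses are ignored rather than multiplying the output.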

Answer 2

My answer is based on two building blocks: itertools.product and zip.

itertools.product lets us get all the combinations of the dictionary's list values.

zip, applied to the original keys and one of the combinations above, lets us create a list of tuples that we can use with replace.
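
To make the two building blocks concrete, here is a small sketch of what each one yields. The reduced dictionary small is an assumption made for brevity and is not part of the original answer:

```python
import itertools

small = {"PERSON": ["Alice", "Bob"], "ADJECTIVE": ["cute", "intelligent"]}

# product: one tuple per combination, with values in the dictionary's key order
combos = list(itertools.product(*small.values()))
print(combos)

# zip: pair each key with the value chosen for it in a single combination
pairs = list(zip(small.keys(), combos[0]))
print(pairs)
```

Each (key, value) pair can be unpacked straight into str.replace, which is what the full loop in this answer does.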

import itertools

sentence = "PERSON is ADJECTIVE and is from COUNTRY"
dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"], "COUNTRY": ["USA", "Japan", "China", "India"]}

keys = dictionary.keys()
for values in itertools.product(*dictionary.values()):
    new_sentence = sentence
    for tpl in zip(keys, values):
        new_sentence = new_sentence.replace(*tpl)
    print(new_sentence)
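
One thing to keep in mind with the loop above is that str.replace matches substrings, so a key such as PERSON would also be replaced inside a longer word such as PERSONAL. If that matters for your templates, a possible variant uses a regular expression with word boundaries. This is only a sketch under the assumption that the dictionary is non-empty; the function name expand_whole_words is hypothetical:

```python
import itertools
import re

def expand_whole_words(sentence, dictionary):
    """Yield sentences with whole-word keys replaced; assumes a non-empty dictionary."""
    keys = list(dictionary)
    # \b anchors keep "PERSON" from matching inside "PERSONAL"
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, keys)) + r")\b")
    for values in itertools.product(*dictionary.values()):
        mapping = dict(zip(keys, values))
        # A single pass, so text that was just substituted is never replaced again
        yield pattern.sub(lambda m: mapping[m.group(1)], sentence)

dictionary = {"PERSON": ["Alice", "Bob"]}
for s in expand_whole_words("PERSON sent a PERSONAL note", dictionary):
    print(s)
```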

If you happen to have control over the sentence template, you could write it like this:

sentence = "{PERSON} is {ADJECTIVE} and is from {COUNTRY}"

Then you can simplify the code to:

sentence = "{PERSON} is {ADJECTIVE} and is from {COUNTRY}"
dictionary = {"PERSON": ["Alice", "Bob", "Carol"], "ADJECTIVE": ["cute", "intelligent"], "COUNTRY": ["USA", "Japan", "China", "India"]}

keys = dictionary.keys()
for values in itertools.product(*dictionary.values()):
    new_sentence = sentence.format(**dict(zip(keys, values)))
    print(new_sentence)

Both should give you results like:

Alice is cute and is from USA
Alice is cute and is from Japan
...
Carol is intelligent and is from China
Carol is intelligent and is from India

Note that the order of appearance in the template does not matter; both solutions should work with the following template:

sentence = "PERSON is from COUNTRY and is ADJECTIVE"

or, for the second case:

sentence = "{PERSON} is from {COUNTRY} and is {ADJECTIVE}"

Follow-up:

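What happens if the dictionary may contain items that are not in the sentence template? Currently the result is not ideal: the way product() is used to generate the sentences assumes that every key is used, so we would generate duplicates.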

The simplest fix is to make sure the dictionary contains only the keys of interest.

In the first case, that could be done like this:

dictionary = {key: value for key, value in dictionary.items() if key in sentence}

Or in the second case:

dictionary = {key: value for key, value in dictionary.items() if f"{{{key}}}" in sentence}
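
To see where the duplicates come from and how the filter removes them, here is a small sketch for the second case. The reduced data and the helper name expand are assumptions for illustration:

```python
import itertools

sentence = "{PERSON} is {ADJECTIVE}"
dictionary = {
    "PERSON": ["Alice", "Bob"],
    "ADJECTIVE": ["cute"],
    "COUNTRY": ["USA", "Japan"],  # not used by the template
}

def expand(sentence, dictionary):
    keys = dictionary.keys()
    # str.format silently ignores keyword arguments the template does not use
    return [sentence.format(**dict(zip(keys, values)))
            for values in itertools.product(*dictionary.values())]

unfiltered = expand(sentence, dictionary)
print(len(unfiltered), len(set(unfiltered)))  # 4 sentences, only 2 distinct

# Keep only the keys that actually appear as {KEY} in the template
filtered = {k: v for k, v in dictionary.items() if f"{{{k}}}" in sentence}
print(expand(sentence, filtered))
```

Every value of the unused COUNTRY key repeats each sentence once, which is why filtering before calling product avoids both the duplicates and the wasted work.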

