How to take output from iterating and store it in a dictionary

Question

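I have this script (running Python 3.5) that uses the Google API and Newspaper. It searches Google for articles related to sleep, then uses Newspaper to iterate over those URLs. All I ask Newspaper to do is return the list of keywords for each article, which I get by accessing article.keywords.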

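import google  # assumed: the third-party "google" search package that provides google.search
from newspaper import Article

for url in google.search('sleep', num=2, stop=1):
    article = Article(url)
    article.download()
    article.parse()
    article.nlp()
    print(article.keywords)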

The keywords returned (for a given article) look like this:

['education', 'nights', 'start', 'pill', 'supplement', 'research', 'national', 'sleep', 'sleeping', 'trouble', 'using', 'taking']

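But I want to create a dictionary that contains all the keywords for all the results, that is, the keywords of every article that gets iterated over. How do I do that?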

Answer 1

Assuming the dictionary keys should be the article URLs:

keywords = {}
for url in google.search('sleep', num=2, stop=1):
    article = Article(url)
    article.download()
    article.parse()
    article.nlp()

    # Map each URL to the keyword list of its article
    keywords[url] = article.keywords

print(keywords)
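
To try the dictionary pattern without network access, here is a minimal sketch. The `fetch_keywords` helper and the `FAKE_RESULTS` data are hypothetical stand-ins for the download/parse/nlp steps; they are not part of Newspaper, and the URLs and keyword lists are made up for illustration.

```python
# Hypothetical stand-in data: in the real script these lists would come
# from article.keywords after download(), parse() and nlp().
FAKE_RESULTS = {
    "https://example.com/a": ["sleep", "pill", "research"],
    "https://example.com/b": ["sleep", "nights", "trouble"],
}


def fetch_keywords(url):
    """Hypothetical helper: return the keyword list for one URL."""
    return FAKE_RESULTS[url]


def collect_keywords(urls):
    """Build a dict that maps each URL to its list of keywords."""
    keywords = {}
    for url in urls:
        keywords[url] = fetch_keywords(url)
    return keywords


keywords_by_url = collect_keywords(FAKE_RESULTS)
print(keywords_by_url["https://example.com/a"])
```

Keeping the loop inside a function like `collect_keywords` makes it easy to swap the stand-in helper for the real Newspaper calls later.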

Or, if you want a single list of all keywords across all articles:

keywords = []
for url in google.search('sleep', num=2, stop=1):
    article = Article(url)
    article.download()
    article.parse()
    article.nlp()

    # Extend the flat list with this article's keywords (duplicates are kept)
    keywords += article.keywords

print(keywords)
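
The two variants are related: if you already have a dictionary keyed by URL, the flat list can be derived from it. The sketch below assumes a `keywords_by_url` dictionary with made-up contents and uses `itertools.chain.from_iterable` from the standard library; on a list, `+=` behaves like `extend`, so the result is equivalent to the loop above.

```python
from itertools import chain

# Made-up example data, shaped like the URL -> keywords dictionary
keywords_by_url = {
    "https://example.com/a": ["sleep", "pill", "research"],
    "https://example.com/b": ["sleep", "nights", "trouble"],
}

# Concatenate every keyword list into one flat list (duplicates are kept)
all_keywords = list(chain.from_iterable(keywords_by_url.values()))
print(all_keywords)
```

Note that dictionaries only guarantee insertion order from Python 3.7 onward, so on Python 3.5 the order of the flat list may vary.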

Answer 2

To avoid inserting the same keyword more than once (almost the same as the other answer):

keywords = []
for url in google.search('sleep', num=2, stop=1):
    article = Article(url)
    article.download()
    article.parse()
    article.nlp()
    for kw in article.keywords:
        if kw not in keywords:
            keywords.append(kw)

Or, better yet, use a set instead of a list.
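
A set-based version of the idea above might look like the following sketch. The `keyword_lists` data is made up and stands in for the `article.keywords` lists collected in the loop. A set drops duplicates automatically and makes membership checks fast, but it does not preserve order; if order matters, `dict.fromkeys` is one alternative on Python 3.7 or later.

```python
# Made-up stand-in for the article.keywords lists of two articles
keyword_lists = [
    ["sleep", "pill", "research"],
    ["sleep", "nights", "trouble"],
]

# A set ignores keywords it already contains
unique_keywords = set()
for kws in keyword_lists:
    unique_keywords.update(kws)

# Order-preserving alternative (relies on ordered dicts, Python 3.7+)
ordered_unique = list(dict.fromkeys(kw for kws in keyword_lists for kw in kws))

print(sorted(unique_keywords))
print(ordered_unique)
```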

python

google-api

python-3.x

python-newspaper