Check whether words from a list are inside a string of another list (Python)
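I am trying to get all the headlines from the front page of the New York Times and want to see how many times a certain word is mentioned. In this particular case, I want to see how many headlines mention the coronavirus or Trump. This is my code, but it doesn't work, because 'number' stays at the integer I gave it before the while loop.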
import requests
from bs4 import BeautifulSoup

url = 'https://www.nytimes.com'
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")
a = soup.findAll("h2", class_="esl82me0")
for story_heading in a:
    print(story_heading.contents[0])

lijst = ["trump", "Trump", "Corona", "COVID", "virus", "Virus", "Coronavirus", "COVID-19"]
number = 0
run = 0
while run < len(a)+1:
    run += 1
    if any(lijst in s for s in a):
        number += 1
print("\nTrump or the Corona virus have been mentioned", number, "times.")
So basically I want the variable 'number' to increase by 1 whenever a headline (an entry in the list a) contains Trump, Coronavirus, or both.
Does anyone know how to do this?
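In general, I'd recommend putting more thought into how you name variables. I like the way you tried printing the story headings. The line 'if any(lijst in s for s in a)' does not do what you expect: you need to iterate over each word within a single h2. The any function is just shorthand for the following function: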
def any(iterable):
    for element in iterable:
        if element:
            return True
    return False
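To make that concrete, here is a small sketch with a made-up headline and a shortened keyword list (both are illustrative assumptions, not data from the site). It shows the membership test written the other way round, one keyword at a time, and what plain Python does when a whole list is tested against a string.

```python
# Hypothetical sample data, chosen only for illustration.
keywords = ["Trump", "Coronavirus"]
headline = "Coronavirus Cases Rise Again"

# any() receives one True/False per keyword and stops at the first True.
mentions_keyword = any(keyword in headline for keyword in keywords)
print(mentions_keyword)

# Testing the whole list against a plain string is rejected outright,
# because the left operand of "in" must itself be a string here.
try:
    keywords in headline
    list_check_failed = False
except TypeError as exc:
    list_check_failed = True
    print("TypeError:", exc)
```

With a plain str the mistake is loud. With the h2 elements returned by BeautifulSoup the same kind of check appears to evaluate to False quietly, which would explain why 'number' never changed.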
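In other words, you were checking whether the entire list is contained in an h2 element, which will never be true. Here is an example of a fix.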
import requests
from bs4 import BeautifulSoup

url = 'https://www.nytimes.com'
r = requests.get(url)
soup = BeautifulSoup(r.text, "html.parser")

# Collect the headline elements and print each one
h2s = soup.findAll("h2", class_="esl82me0")
for story_heading in h2s:
    print(story_heading.contents[0])

keywords = ["trump", "Trump", "Corona", "COVID", "virus", "Virus", "Coronavirus", "COVID-19"]
number = 0

for h2 in h2s:
    headline = h2.text
    # Split the headline into words and compare each word with the keywords
    words_in_headline = headline.split(" ")
    for word in words_in_headline:
        # Exact match only; every matching word adds 1 to the count
        if word in keywords:
            number += 1
print("\nTrump or the Corona virus have been mentioned", number, "times.")
Output
Trump or the Corona virus have been mentioned 7 times.
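Note that the loop above counts matching words, so a headline naming both Trump and the coronavirus adds 2, and a word with punctuation attached (such as "Trump's") is not matched. If the goal is to count headlines rather than words, one possible variant is sketched below. The function name count_matching_headlines and the sample headlines are hypothetical, and it works on plain strings (for example a list built from h2.text) so that it runs without a network request.

```python
def count_matching_headlines(headlines, keywords):
    """Count headlines containing at least one keyword, ignoring case."""
    lowered_keywords = [keyword.lower() for keyword in keywords]
    count = 0
    for headline in headlines:
        text = headline.lower()
        # Substring test per keyword; any() stops at the first hit,
        # so each headline is counted at most once.
        if any(keyword in text for keyword in lowered_keywords):
            count += 1
    return count


# Made-up headlines, for illustration only.
sample_headlines = [
    "Trump Signs Relief Bill as Coronavirus Cases Climb",
    "Markets Rally After Jobs Report",
    "What We Know About the COVID-19 Vaccine",
]
sample_keywords = ["trump", "corona", "covid", "virus"]

number = count_matching_headlines(sample_headlines, sample_keywords)
print("\nTrump or the Corona virus have been mentioned in", number, "headlines.")
```

Lowercasing both sides removes the need to list "trump" and "Trump" separately. The trade-off of substring matching is that a keyword also matches inside longer words, so "trump" would match "trumpet" as well.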