How to search for a specific keyword in HTML page source code with BeautifulSoup?

Question

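My goal is to find out how to search an HTML page's source code for a specific keyword and return True or False depending on whether the keyword is found.

The specific keyword I am looking for is 'cdn.secomapp.com'.

Right now my code looks like this: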

from urllib import request
from bs4 import BeautifulSoup


url_1 = "https://cheapchicsdesigns.com"
keyword ='cdn.secomapp.com'
page = request.urlopen(url_1)
soup = BeautifulSoup(page)
soup.find_all("head", string=keyword)

But when I run it, the return value is an empty list:

[]

Can anyone help? Thanks in advance.

Answer 1

Try:

from urllib import request
from bs4 import BeautifulSoup


url_1 = "https://cheapchicsdesigns.com"
keyword = 'cdn.secomapp.com'
page = request.urlopen(url_1)
soup = BeautifulSoup(page, 'html.parser')
print(keyword in soup.text)

Prints:

True

Or:

import requests
from bs4 import BeautifulSoup


url_1 = "https://cheapchicsdesigns.com"
keyword = 'cdn.secomapp.com'
page = requests.get(url_1)
soup = BeautifulSoup(page.content, 'html.parser')
print(keyword in soup.text)

Prints:

True
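As a side note on why the original find_all call came back empty: string= is compared against a tag's .string, and .string is None for a tag with more than one child, which is usually the case for <head>. Also, soup.text only covers text nodes, so a keyword that lives only in an attribute (for example a script src) would not show up there. Here is a minimal sketch of those differences. The HTML snippet and the app.js path are made up for illustration and are not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption, not the real page):
# the keyword appears only inside a <script> src attribute.
html = """
<html><head>
<title>Demo</title>
<script src="https://cdn.secomapp.com/app.js"></script>
</head><body><p>Hello</p></body></html>
"""
keyword = "cdn.secomapp.com"
soup = BeautifulSoup(html, "html.parser")

# <head> has several children, so head.string is None and nothing matches.
head_matches = soup.find_all("head", string=keyword)

# Text nodes only: attribute values are not part of the text.
in_text = keyword in soup.get_text()

# Serialized markup: includes tags and attribute values.
in_markup = keyword in str(soup)

# Targeted check: look only at <script> tags that have a src attribute.
in_script_src = any(
    keyword in tag["src"] for tag in soup.find_all("script", src=True)
)

print(head_matches, in_text, in_markup, in_script_src)
```

If you need to know where the keyword occurs (which tag, which attribute), the targeted check is the one to extend. If you only need True/False, testing the serialized markup is enough.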

Answer 2

If your only purpose is to check whether the keyword is present, you don't need to construct a BeautifulSoup object.

from urllib import request

url_1 = "https://cheapchicsdesigns.com"
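keyword = 'cdn.secomapp.com'
page = request.urlopen(url_1)

# read() returns bytes, so decode before testing for a str keyword
print(keyword in page.read().decode('utf-8', errors='replace'))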

But I recommend using requests, since it is simpler:

import requests

url_1 = "https://cheapchicsdesigns.com"
keyword = 'cdn.secomapp.com'

res = requests.get(url_1)

print(keyword in res.text)
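Since the question asks for a True/False return value, the same idea can be wrapped in a function. This is only a sketch using the standard library. The names html_contains and page_contains and the 10-second timeout are my own choices, and falling back to UTF-8 when the server does not declare a charset is an assumption.

```python
from urllib import request


def html_contains(html, keyword):
    """Return True if keyword occurs anywhere in the raw HTML."""
    # Accept bytes as well as str, since urlopen(...).read() returns bytes.
    if isinstance(html, bytes):
        html = html.decode("utf-8", errors="replace")
    return keyword in html


def page_contains(url, keyword, timeout=10):
    """Fetch url and report whether keyword occurs in its source.

    Network errors are not caught here; they propagate to the caller.
    """
    with request.urlopen(url, timeout=timeout) as page:
        # Use the charset from the Content-Type header if there is one,
        # otherwise assume UTF-8.
        charset = page.headers.get_content_charset() or "utf-8"
        html = page.read().decode(charset, errors="replace")
    return html_contains(html, keyword)
```

You would call it as page_contains("https://cheapchicsdesigns.com", "cdn.secomapp.com"). Keeping html_contains separate means the check itself can be tried on a saved string without touching the network.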

Tags: url, urllib, beautifulsoup, python-3.x