Get meta tag content by name with Beautiful Soup and Python
I'm trying to get metadata from this website (here is the code).
import requests
from bs4 import BeautifulSoup
source = requests.get('https://www.svpboston.com/').text
soup = BeautifulSoup(source, features="html.parser")
title = soup.find("meta", name="description")
image = soup.find("meta", name="og:image")
print(title["content"] if title else "No meta title given")
print(image["content"]if title else "No meta title given")
But I get this error.
Traceback (most recent call last):
File "C:/Users/User/PycharmProjects/Work/Web Scraping/Selenium/sadsaddas.py", line 9, in <module>
title = soup.find("meta", name="description")
TypeError: find() got multiple values for argument 'name'
Any ideas?
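The first parameter of find() is the tag name, and that parameter is itself called name, so passing name= as a keyword gives it two values. Collect the meta tags and filter on the attribute instead:
meta = soup.find_all("meta")
# m.get() returns None when the attribute is missing, so no KeyError
title = next((m for m in meta if m.get("name") == "description"), None)
image = next((m for m in meta if m.get("name") == "og:image"), None)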
You can try it like this:
title = soup.find("meta", attrs={"name":"description"})
image = soup.find("meta", attrs={"name":"og:image"})
print(title)
print(image)
print(title["content"] if title else "No meta title given")
print(image["content"] if image else "No meta for image given")
Or:
title = soup.find("meta", property="og:title")
print(title["content"] if title else "No meta title given")
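A CSS attribute selector through select_one() is another way around the name keyword collision. This is only a minimal sketch: the HTML string below is made up for illustration and is not the real page, so the actual tags on the site may differ.

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration; the real page's tags may differ.
html = """
<html><head>
<meta name="description" content="Example description">
<meta property="og:title" content="Example title">
<meta property="og:image" content="https://example.com/image.png">
</head><body></body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Attribute selectors match on any attribute, including one called "name".
description = soup.select_one('meta[name="description"]')
og_image = soup.select_one('meta[property="og:image"]')

# select_one() returns None when nothing matches, so guard before indexing.
description_text = description["content"] if description else None
og_image_url = og_image["content"] if og_image else None
print(description_text)
print(og_image_url)
```

The selector form keeps the tag and the attribute filter in one string, which some people find easier to read than an attrs dictionary.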
From the bs4 docs:
You can't use a keyword argument to search for HTML's 'name' element, because Beautiful Soup uses the name argument to contain the name of the tag itself. Instead, you can give a value to 'name' in the attrs argument
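To see the collision described in the docs without hitting the network, here is a small sketch. The one-line HTML string is an assumption made up for the demonstration.

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration only.
html = '<head><meta name="description" content="Example description"></head>'
soup = BeautifulSoup(html, "html.parser")

# "meta" already fills the positional parameter called name, so the
# keyword name= supplies a second value and Python raises TypeError.
try:
    soup.find("meta", name="description")
    collided = False
except TypeError:
    collided = True

# attrs= filters on the HTML attribute instead of the tag name.
tag = soup.find("meta", attrs={"name": "description"})
content = tag["content"] if tag else None
print(collided, content)
```

The TypeError comes from Python's argument binding, before Beautiful Soup searches anything, which is why it shows up no matter what the page contains.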
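To get a tag by a specific attribute, I suggest putting it in a dictionary and passing that dictionary as the attrs argument to .find(). But you are also passing the wrong attribute to get the title and image. You should use property=<...> instead of name=<...> to match these meta tags. Here is the final code that gets what you want: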
import requests
from bs4 import BeautifulSoup
source = requests.get('https://www.svpboston.com/').text
soup = BeautifulSoup(source, features="html.parser")
# Open Graph tags carry their key in the property attribute, not name
title = soup.find("meta", attrs={'property': 'og:title'})
image = soup.find("meta", attrs={'property': 'og:image'})
print(title["content"] if title is not None else "No meta title given")
print(image["content"] if image is not None else "No meta image given")
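If you don't know in advance whether a page uses property or name for a given key, a small helper can try both. This is a sketch and not part of bs4: get_meta_content is a hypothetical name, and the HTML string is made up so the example runs offline.

```python
from bs4 import BeautifulSoup


def get_meta_content(soup, key, default=None):
    """Return the content of the first <meta> whose property or name equals key.

    Hypothetical helper for illustration; it is not part of Beautiful Soup.
    """
    for attr in ("property", "name"):
        tag = soup.find("meta", attrs={attr: key})
        # Skip tags that match the key but carry no content attribute.
        if tag is not None and tag.has_attr("content"):
            return tag["content"]
    return default


# Made-up HTML for illustration; the real page's tags may differ.
html = """
<head>
<meta name="description" content="Example description">
<meta property="og:title" content="Example title">
</head>
"""
soup = BeautifulSoup(html, "html.parser")

og_title = get_meta_content(soup, "og:title")
description = get_meta_content(soup, "description")
og_image = get_meta_content(soup, "og:image", default="No meta image given")
print(og_title)
print(description)
print(og_image)
```

Returning a default instead of raising keeps the calling code free of repeated "if tag is not None" checks.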