BeautifulSoup doesn't work properly with all URLs

The error says:

AttributeError: 'NoneType' object has no attribute 'get_text'

I'm following a web scraping tutorial, and everything worked fine with this url, but when I changed it to this url, the error mentioned above appeared.

The crawler function:

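import requests
from bs4 import BeautifulSoup

def product_crawler():
    # url and headers are defined elsewhere in the script
    page = requests.get(url, headers=headers)
    soup = BeautifulSoup(page.content, 'html.parser')
    title = soup.find(id="productTitle").get_text()
    print(title)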

I checked all the answers on Stack Overflow, such as changing html.parser to lxml, but none of them worked.

Try adding the Accept-Language HTTP header:

import requests
from bs4 import BeautifulSoup

url = "https://www.amazon.com/dp/B08DK5ZH44"

headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:89.0) Gecko/20100101 Firefox/89.0",
    "Accept-Language": "en-US,en;q=0.5",
}

page = requests.get(url, headers=headers)
soup = BeautifulSoup(page.content, "html.parser")
title = soup.find(id="productTitle").get_text(strip=True)
print(title)

Prints:

GoPro HERO9 Black - Waterproof Action Camera with Front LCD and Touch Rear Screens, 5K Ultra HD Video, 20MP Photos, 1080p Live Streaming, Webcam, Stabilization
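
As for the AttributeError itself: soup.find() returns None when the parsed page has no element with the requested id, and calling .get_text() on None raises it. That can happen if the server sends back a different page than the one you see in a browser. A minimal sketch of guarding against that case is below. The helper name extract_title and both sample HTML strings are made up for illustration and are not real Amazon responses.

```python
from typing import Optional

from bs4 import BeautifulSoup


def extract_title(html: str) -> Optional[str]:
    """Return the text of the element with id="productTitle", or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    element = soup.find(id="productTitle")  # find() returns None when nothing matches
    if element is None:
        return None
    return element.get_text(strip=True)


# Hypothetical sample pages, used only to show both outcomes
product_page = '<html><body><span id="productTitle">  Example Camera  </span></body></html>'
blocked_page = "<html><body><p>Please confirm you are not a robot.</p></body></html>"

print(extract_title(product_page))  # the title text, with whitespace stripped
print(extract_title(blocked_page))  # None, instead of an AttributeError
```

In the real script you would pass page.content to a helper like this and, when it returns None, print or save the response body to see what the server actually sent.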