TypeError shows on my output terminal: "'NoneType' object is not subscriptable"
I'm having trouble extracting the 'href'. Here is the HTML:
<a href="https://www.akinsfoodltd.co.uk?utm_source=yell&utm_medium=referral&utm_campaign=yell" data-tracking="WL:CLOSED" class="btn btn-yellow businessCapsule--ctaItem" target="_blank" rel="nofollow noopener">
<div class="icon icon-Business-website" title="Visit Akin's Food Ltd's Website"></div> Website</a>
Here is my code:
from bs4 import BeautifulSoup
import requests
import csv
url = 'https://www.yell.com/ucs/UcsSearchAction.do?keywords=Food&location=United+Kingdom&scrambleSeed=1316051868'
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36/8mqNJauL-25'}
response = requests.get(url, headers=header)
# Parse the response body (soup was used below without being defined)
soup = BeautifulSoup(response.text, 'html.parser')
product = soup.find_all('div', 'row businessCapsule--mainRow')
#print(product)
for x in product:
    name = x.find('h2', {'itemprop': 'name'}).text
    address = x.find('span', {'itemprop': 'streetAddress'}).text
    post_code = x.find('span', {'itemprop': 'postalCode'}).text
    telp = x.find('span', 'business--telephoneNumber').text
    web = x.find('a', {'rel': 'nofollow noopener'})["href"]
    print(web)
The output terminal shows:
TypeError: 'NoneType' object is not subscriptable
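You are trying to read an href from a container that has no matching link, which is why you get this error: find() returns None for that listing, and None cannot be subscripted with ["href"]. Here is one of several ways to handle it: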
import requests
from bs4 import BeautifulSoup
url = 'https://www.yell.com/ucs/UcsSearchAction.do?keywords=Food&location=United+Kingdom&scrambleSeed=1316051868'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36/8mqNJauL-25'}
res = requests.get(url, headers=headers)
soup = BeautifulSoup(res.text, "html.parser")
for item in soup.find_all(class_='businessCapsule--mainRow'):
    name = item.find('h2', class_='businessCapsule--name').text
    phone = item.find(class_='business--telephoneNumber').text
    try:
        website = item.find('a', {'data-tracking': 'WL:CLOSED'}).get("href")
    except (TypeError, AttributeError):
        # No website link in this listing
        website = ""
    print(name, phone, website)
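As an alternative to try/except, you could check for None explicitly before reading the attribute. The sketch below is only an illustration: the SAMPLE_HTML snippet and the extract_website helper are hypothetical names made up for this example (they are not from the Yell page or from BeautifulSoup), and the HTML is a simplified stand-in for two listings, one with a website link and one without.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup: the first listing has a website link,
# the second one does not.
SAMPLE_HTML = """
<div class="row businessCapsule--mainRow">
  <a href="https://www.akinsfoodltd.co.uk" data-tracking="WL:CLOSED" rel="nofollow noopener">Website</a>
</div>
<div class="row businessCapsule--mainRow">
  <span class="business--telephoneNumber">01234 567890</span>
</div>
"""


def extract_website(item):
    """Return the website href for one listing, or "" when there is no link."""
    link = item.find('a', {'data-tracking': 'WL:CLOSED'})
    if link is None:
        # find() found nothing, so there is no tag to read href from
        return ""
    # Tag.get() returns the default instead of raising if href is absent
    return link.get("href", "")


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
websites = [extract_website(item)
            for item in soup.find_all(class_='businessCapsule--mainRow')]
print(websites)
```

The explicit check makes it obvious which case is being handled, whereas a broad except clause can also hide unrelated mistakes inside the try block. The same pattern would apply to the name and phone lookups if some listings lack those elements.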