Using BeautifulSoup to scrape data, but I'm not getting all the data
I want to scrape the data in the table, but the response contains no data after I request the URL:
import requests
url='https://iprice.hk/insights/mapofecommerce/iframe/?lang=en&loc=th'
headers={'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.99 Safari/537.36'}
r=requests.get(url,headers=headers)
print(r.content.decode())
from bs4 import BeautifulSoup
soup = BeautifulSoup(r.content,'html5lib')
table=soup.find('div', attrs={'id':'data'})
I fail to retrieve the data:
print(table.prettify())
I don't know whether this is related to data permissions.
As Scott Hunter said, extracting the data from the CSV is one workaround:
import requests
import csv
download = requests.get('https://ipg-moe.s3-ap-southeast-1.amazonaws.com/th/2021-q3.csv')
decoded_content = download.content.decode('utf-8')
cr = csv.reader(decoded_content.splitlines(), delimiter=',')
rows = list(cr)
# The first row holds the column names, so drop it
del rows[0]
for row in rows:
    print(f'''
Name: {row[1]}
Url: {row[2]}
Traffic: {row[4]}
iOS: {row[5]}
Android: {row[6]}
Line: {row[7]}
Instagram: {row[8]}
Facebook: {row[9]}
Type: {row[10]}
Category: {row[11]}
Location: {row[12]}
''')
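If the hard-coded indexes like row[1] and row[2] feel fragile, a possible variation is to let csv.DictReader key each row by the header line instead of deleting it. This is only a minimal sketch: parse_rows is a hypothetical helper name, and the sample text and its column names are made up for illustration, since the real file's headers may differ.

```python
import csv
import io


def parse_rows(csv_text):
    """Return each data row as a dict keyed by the header row's column names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return list(reader)


# Hypothetical sample standing in for decoded_content;
# the real CSV's column names are an assumption here.
sample = "id,name,url\n1,Example Shop,https://example.com\n"

records = parse_rows(sample)
for record in records:
    # Fields are looked up by column name rather than by position
    print(record["name"], record["url"])
```

With the real download, passing decoded_content to parse_rows should work the same way, provided the first line of the file is the header row.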