Python code scraping data from Kickstarter stops working after some iterations
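I'm trying to scrape data from Kickstarter. The code works, but on page 15 it fails with the following error (the page is dynamic, so you may hit the error on a different page):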
Traceback (most recent call last):
  File "C:\Users\lenovo\kick.py", line 30, in <module>
    csvwriter.writerow(row)
  File "C:\Users\lenovo\AppData\Local\Programs\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\uff5c' in position 27: character maps to <undefined>
What could be the problem? Any suggestions?
from urllib.request import urlopen
from bs4 import BeautifulSoup
import json
import csv

KICKSTARTER_SEARCH_URL = "https://www.kickstarter.com/discover/advanced?category_id=16&sort=newest&seed=2502593&page={}"
DATA_FILE = "kickstarter.csv"

csvfile = open(DATA_FILE, 'w')
csvwriter = csv.writer(csvfile, delimiter=',')

page_start = 0
while True:
    url = KICKSTARTER_SEARCH_URL.format(page_start)
    print(url)
    response = urlopen(url)
    html = response.read()
    soup = BeautifulSoup(html, 'html.parser')
    project_details_divs = soup.findAll('div', {"class":"js-react-proj-card"})
    if len(project_details_divs) == 0:
        break
    for div in project_details_divs:
        project = json.loads(div['data-project'])
        row = [project["id"],project["name"],project["goal"],project["pledged"]]
        csvwriter.writerow(row)
    page_start += 1

csvfile.close()
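Add the encoding parameter to your open() call. Without it, open() on Windows falls back to the locale encoding (cp1252 here, as the traceback shows), which has no mapping for '\uff5c' (a fullwidth vertical bar that appears in one of the project names). That is, change
csvfile = open(DATA_FILE, 'w')
to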
csvfile = open(DATA_FILE, 'w', encoding='utf-8')
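To see the fix in isolation, without hitting Kickstarter, here is a minimal sketch. The helper names (write_rows, read_rows), the sample row, and the temporary file name are made up for illustration and are not part of the original script. It also passes newline='', which the csv module documentation recommends when opening files for a csv writer.

```python
import csv
import os
import tempfile


def write_rows(path, rows, encoding="utf-8"):
    """Write rows to a CSV file using an explicit text encoding."""
    # newline="" lets the csv writer control line endings itself,
    # which avoids extra blank lines on Windows.
    with open(path, "w", encoding=encoding, newline="") as f:
        csv.writer(f, delimiter=",").writerows(rows)


def read_rows(path, encoding="utf-8"):
    """Read the CSV file back as a list of rows (all values are strings)."""
    with open(path, "r", encoding=encoding, newline="") as f:
        return list(csv.reader(f))


# Hypothetical sample row: the name contains U+FF5C, the character
# reported in the traceback.
sample = [[1, "Board Game\uff5cDeluxe", 5000, 1234.5]]

path = os.path.join(tempfile.mkdtemp(), "kickstarter_demo.csv")

# With UTF-8 the row is written and read back intact.
write_rows(path, sample)
round_trip = read_rows(path)

# Forcing cp1252 reproduces the original failure on any platform.
try:
    write_rows(path + ".cp1252", sample, encoding="cp1252")
    cp1252_failed = False
except UnicodeEncodeError:
    cp1252_failed = True
```

Using a with block here also means the file is closed even if a write fails partway through, which the explicit csvfile.close() in the question would not guarantee.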