UnicodeEncodeError: 'charmap' codec can't encode character u'\xfd' in python 2.7
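I am downloading company names from various websites on my localhost, and sometimes I run into this problem, which interrupts the download procedure. My script works fine for other countries, but when I download data for the Czech Republic, this type of error occurs.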
Total companies processed so far:0
Traceback (most recent call last):
  File "process1.py", line 261, in <module>
    print "Company Name: "+hit.text
  File "C:\Python27\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character u'\xfd' in position 3 3: character maps to <undefined>
Here is my code:
if companyAlreadyKnown == 0:
    for hit in soup2.findAll("h1"):
        print "Company Name: "+hit.text
        pCompanyName = hit.text
        flog.write("\nCompany Name: "+str(pCompanyName))
        companyObj.setCompanyName(pCompanyName)
I don't know why this problem happened. Any suggestions?
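Czech contains many non-ASCII characters. u'\xfd' is the Unicode representation of ý. hit.text is already a unicode object, so the error comes from print trying to encode it for the Windows console, whose cp437 code page has no ý. You need to encode the text explicitly, for example to UTF-8, before printing it or writing it to the log. The console may show the UTF-8 bytes as other symbols, but the script no longer stops. A better solution is to detect the encoding used by the website you are scraping and decode the page from that encoding.

if companyAlreadyKnown == 0:
    for hit in soup2.findAll("h1"):
        company_name = hit.text  # already a unicode object
        # Encode once; a byte string passes through print and write unchanged
        encoded_name = company_name.encode('utf-8')
        print "Company Name: " + encoded_name
        flog.write("\nCompany Name: " + encoded_name)
        companyObj.setCompanyName(company_name)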
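
If you would rather not hard-code UTF-8, the two steps described above (decode the page from the site's encoding, then encode for wherever the text is going) can be sketched roughly as follows. The helper names decode_page and console_safe and the candidate list ("utf-8", "cp1250") are my own assumptions for illustration, not part of BeautifulSoup or of the asker's script. Treat it as a starting point rather than a definitive fix.

```python
# -*- coding: utf-8 -*-
import sys


def decode_page(raw, candidates=("utf-8", "cp1250")):
    """Decode raw bytes, trying each candidate encoding in order.

    The candidate list is an assumption: UTF-8 first because it rejects
    invalid input, then cp1250, a common legacy encoding for Czech pages.
    Returns a (text, encoding_used) tuple.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Last resort: keep going and mark undecodable bytes with U+FFFD.
    return raw.decode(candidates[0], "replace"), candidates[0]


def console_safe(text, encoding=None):
    """Encode unicode text for the console without raising.

    Characters the console code page lacks (such as U+00FD under cp437)
    become '?' instead of triggering UnicodeEncodeError.
    """
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, "replace")
```

In the loop above this would be used as print "Company Name: " + console_safe(hit.text). For the log file, opening it with io.open(path, "a", encoding="utf-8") lets you write unicode objects directly, with no manual encode call.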