'UCS-2' codec can't encode characters in position 1050-1050
When I run my Python code, I get the following error:
  File "E:\python343\crawler.py", line 31, in <module>
    print (x1)
  File "E:\python343\lib\idlelib\PyShell.py", line 1347, in write
    return self.shell.write(s, self.tags)
UnicodeEncodeError: 'UCS-2' codec can't encode characters in position 1050-1050: Non-BMP character not supported in Tk
Here is my code:
x = g.request('search', {'q' : 'TaylorSwift', 'type' : 'page', 'limit' : 100})['data'][0]['id']

# GET ALL STATUS POST ON PARTICULAR PAGE(X=PAGE ID)
for x1 in g.get_connections(x, 'feed')['data']:
    print (x1)
    for x2 in x1:
        print (x2)
        if(x2[1]=='status'):
            x2['message']
How can I fix this?
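Your data contains characters outside the Basic Multilingual Plane (BMP). Emoji, for example, are outside the BMP, and Tk, the windowing toolkit that IDLE uses, cannot handle such characters.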
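To check whether a given string really contains such characters, a small sketch like the following may help. The helper name find_non_bmp and the sample string are made up for illustration; the only assumption is that any code point above U+FFFF counts as non-BMP.

```python
def find_non_bmp(text):
    """Return (index, character) pairs for code points above U+FFFF."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 0xFFFF]

# Hypothetical sample; U+1F44D (thumbs up) lies outside the BMP
sample = 'thumbs up \U0001F44D'
hits = find_non_bmp(sample)

# Print only ASCII descriptions so the report itself is safe to show in IDLE
for index, ch in hits:
    print(index, 'U+{:05X}'.format(ord(ch)))
```

The index reported here corresponds to the "position" mentioned in the UnicodeEncodeError message.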
You could use a translation table to map everything outside of the BMP to the replacement character:
import sys
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
print(x.translate(non_bmp_map))
The non_bmp_map table maps every codepoint outside the BMP (any codepoint higher than 0xFFFF, all the way up to the highest Unicode codepoint your Python version can handle) to U+FFFD REPLACEMENT CHARACTER:
>>> print('This works outside IDLE! \U0001F44D')
This works outside IDLE! 👍
>>> print('This works in IDLE too! \U0001F44D'.translate(non_bmp_map))
This works in IDLE too! �
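Note that str.translate only exists on strings, while the items printed in the question's loop appear to be dictionaries. A minimal sketch of a wrapper that converts any value to text first is shown below; the helper name bmp_safe and the sample post dictionary are hypothetical, not part of any API used in the question.

```python
import sys

# Same table as above: every code point above U+FFFF maps to U+FFFD
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)

def bmp_safe(value):
    """Return str(value) with every non-BMP character replaced by U+FFFD."""
    return str(value).translate(non_bmp_map)

# Hypothetical stand-in for one item of the feed data
post = {'message': 'Great show \U0001F44D', 'type': 'status'}
cleaned = bmp_safe(post)
```

With this in place, print(bmp_safe(x1)) could stand in for print (x1) in the question's loop, at the cost of losing the original emoji in the displayed text.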
None of these worked for me, but the following does. This assumes that public_tweets was pulled from tweepy api.search:
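for tweet in public_tweets:
    u = tweet.text
    # Escape every non-ASCII character so the Tk console can display the text
    u = u.encode('unicode-escape').decode('utf-8')
    # Print the escaped text; printing tweet.text directly raises the error in IDLE
    print(u)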
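To see what the unicode-escape approach actually produces, here is a short sketch with a made-up sample string. Unlike the translation table, it escapes every non-ASCII character, including BMP characters such as é, and it also escapes backslashes and newlines.

```python
# Hypothetical sample: one non-BMP emoji and one BMP accented letter
text = 'Nice \U0001F44D caf\u00e9'

# The codec output is pure ASCII, so decoding as 'ascii' or 'utf-8' is equivalent
escaped = text.encode('unicode-escape').decode('ascii')
print(escaped)  # prints: Nice \U0001f44d caf\xe9

# The escaping is reversible, so the original text can be recovered later
restored = escaped.encode('ascii').decode('unicode-escape')
```

The escaped form is less readable than the replacement character, but no information is lost.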
This Unicode issue has been seen in Python 3.6 and older versions. To resolve it, simply upgrade to Python 3.8 and run your code again; the error should no longer occur.