Why can't I decode the 'utf8' string in Python 2.7?
I wrote this in Python:
'\xF5\x90\x90\x90'.decode('utf8')
but it raises an error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf5 in position 0: invalid start byte
The string \xF5\x90\x90\x90 should be a standard 'utf8' string.
Its binary representation is 11110101 10010000 10010000 10010000,
which matches the UTF-8 pattern: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
Why can't I decode this string?
From Wikipedia:
In November 2003, UTF-8 was restricted by RFC 3629 to end at U+10FFFF, in order to match the constraints of the UTF-16 character encoding.
The character you are trying to decode is outside that range: the payload bits of those four bytes (101 010000 010000 010000) give code point U+150410, which is greater than U+10FFFF, so the codec rejects the sequence.
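As a minimal check (Python 2.7 syntax), you can compare the last allowed code point, U+10FFFF, with the sequence from your question; the first decodes, the second raises the error you saw:

for raw in ('\xF4\x8F\xBF\xBF',   # U+10FFFF, the largest code point RFC 3629 allows
            '\xF5\x90\x90\x90'):  # U+150410, beyond the allowed range
    try:
        print repr(raw.decode('utf8'))
    except UnicodeDecodeError as exc:
        print 'failed:', exc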