Python fails on valid CP437 character
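The down arrow (↓) is a valid character in the CP437 encoding. I'm writing a program that needs to read and write files using this encoding, but when I try to write a string containing this character to a file, I get the following error: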
File "C:\Python34\lib\encodings\cp437.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2193' in position 0: character maps to <undefined>
The same happens with other CP437 characters, such as ↔.
My code is below, in case I've done something silly there...
ENCODING = 'CP437'

def writeFile(name, text):
    f = open(name, 'w', encoding=ENCODING)
    f.write(text)
    f.close()
According to Wikipedia, the character is valid in the specified encoding, so why does Python tell me otherwise? How can I fix this?
It's a mystery to me, but does this meet your needs?
f = open('somethin.txt', 'wb')
s1 = (chr(8595) + chr(8593) + chr(8592) + chr(8594)).encode('utf-8')
s2 = '↓↑←→'.encode('utf-8')
f.write(s1)
f.write(s2)
f.close()
s1 and s2 are the same byte string.
The Wiki page you linked to says (just above the table that shows the down arrow at 0x19):
Although the ROM provides a graphic for all 256 different possible 8-bit codes, some APIs will not print some code points, in particular the range 1-31 and the code at 127. Instead they will interpret them as control characters. For instance many methods of outputting text on the original IBM PC would interpret the codes for BEL, BS, CR and LF. Many printers were also unable to print these characters.
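The character you are trying to encode, the down arrow, shares its code with the ASCII control character EM (End of Medium). What it meant in old programs depended on context. In Python, the cp437 codec always interprets the codes mentioned in the quote above (1-31 and 127) as control characters, not as printable characters.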
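If you do need the arrow glyphs written as their CP437 byte values, one possible workaround is to map the Unicode graphic characters onto the matching control codes before encoding. The following is only a sketch: the names GLYPH_TO_BYTE, encode_cp437_graphic and write_file are made up for illustration, and the table covers just the arrows (0x18 to 0x1B and 0x1D), so you would have to extend it with whatever other glyphs from the Wikipedia table you use.

```python
# Hypothetical helper, not part of the standard library.
# Maps a few CP437 "graphic" glyphs to the byte values they occupy
# in the 1-31 range (arrows only; extend as needed).
GLYPH_TO_BYTE = {
    '\u2191': 0x18,  # up arrow
    '\u2193': 0x19,  # down arrow
    '\u2192': 0x1A,  # right arrow
    '\u2190': 0x1B,  # left arrow
    '\u2194': 0x1D,  # left-right arrow
}

# str.translate() expects a mapping keyed by code point.
_TO_CONTROL = {ord(glyph): chr(byte) for glyph, byte in GLYPH_TO_BYTE.items()}


def encode_cp437_graphic(text):
    """Swap the glyphs for their control-code twins, then encode as cp437."""
    return text.translate(_TO_CONTROL).encode('cp437')


def write_file(name, text):
    # Binary mode, so no newline translation touches the bytes.
    with open(name, 'wb') as f:
        f.write(encode_cp437_graphic(text))
```

For example, encode_cp437_graphic('\u2193') gives b'\x19'. Reading the file back with the plain cp437 codec returns the control character '\x19' rather than the arrow, so you would need the inverse mapping on the way in. Keep in mind that this makes a real EM control character indistinguishable from a down arrow in the file.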