Why isn't Python writing characters from Latin Extended-A (UnicodeEncodeError when writing to a file)?

Question

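Obligatory introduction noting that I've done some research

This seems like it should be simple (and I'm happy to have it closed as a duplicate if a suitable target question is found), but I'm not familiar enough with character encodings and how Python handles them to figure it out myself. At the risk of seeming lazy, I'll note that the answer may well be in one of the links below, but I haven't spotted it in my reading yet.

I've consulted some of the documentation: Unicode HOWTO, codecs.py docs

I've also looked at some old, highly voted SO questions: Writing Unicode text to a text file?, Python, Unicode, and the Windows console

The problem

Here is an MCVE code example that demonstrates my problem: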

with open('foo.txt', 'wt') as outfile:
    outfile.write('\u014d')

The traceback is as follows:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "C:\Users\cashamerica\AppData\Local\Programs\Python\Python3\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u014d' in position 0: character maps to <undefined>

I'm confused, because code point U+014D is 'ō', an assigned code point: LATIN SMALL LETTER O WITH MACRON (official Unicode source)

I can even print the character to the Windows console (though it renders as a plain 'o'):

>>> print('\u014d')
o

Answer 1

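You are using cp1252 as the default encoding, and it does not include ō. When open() is called in text mode without an encoding argument, Python falls back to the platform's preferred encoding, which is why the traceback goes through encodings\cp1252.py.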
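If you want to confirm this on your own machine, something along these lines should work. It is only a sketch: `can_encode` is a helper name made up for this example, and the value printed for the default encoding depends on your platform and Python configuration (it will not be cp1252 everywhere).

```python
import locale

# The encoding open() falls back to in text mode when no encoding is passed.
# On many Windows setups this is cp1252; elsewhere it is often UTF-8.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)

char = '\u014d'  # LATIN SMALL LETTER O WITH MACRON


def can_encode(text, encoding):
    """Hypothetical helper: True if every character of text fits in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(char, 'cp1252'))  # False: cp1252 has no slot for U+014D
print(can_encode(char, 'utf-8'))   # True: UTF-8 covers every code point
```

An assigned code point only means Unicode defines the character; whether it can be written still depends on the target encoding having a byte sequence for it.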

Use an explicit encoding when writing (and reading) your file:

with open('foo.txt', 'wt', encoding='utf8') as outfile:
    outfile.write('\u014d')
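As a rough illustration of the "and reading" part, the round trip below writes the character and reads it back with the same explicit encoding. The temporary directory is an assumption made for the example so it does not touch your working directory; any writable path would do.

```python
import os
import tempfile

text = '\u014d'

# Scratch location chosen for this sketch only
path = os.path.join(tempfile.mkdtemp(), 'foo.txt')

# Write with an explicit encoding instead of the platform default
with open(path, 'wt', encoding='utf-8') as outfile:
    outfile.write(text)

# Read back with the same encoding so the bytes decode to the same character
with open(path, 'rt', encoding='utf-8') as infile:
    round_tripped = infile.read()

# Inspect the raw bytes that ended up on disk
with open(path, 'rb') as raw:
    raw_bytes = raw.read()

print(round_tripped == text)  # True
print(raw_bytes)              # b'\xc5\x8d'
```

Reading the file without `encoding='utf-8'` on a cp1252 system would decode those two bytes as two unrelated characters, so it helps to pass the encoding on both sides.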

python

unicode

codec

python-3.x