Check if stdout supports Unicode?

Question

If I want to print Unicode, I usually do something like this:

print("There are ", end="")
try:
    print(u"\u221E", end="")  # ∞
    unicode_support = True
except UnicodeError:
    print("infinity", end="")
    unicode_support = False
print(" ways to get Unicode wrong.")

if unicode_support:
    print(u"\U0001F440 see you have a Unicode font.")
else:
    print("You do not have Unicode support.")

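This does not work if I want to return a Unicode string from a method or something similar, because Python always accepts a string literal with Unicode in it and only raises this error when the string is printed to something that does not support Unicode. I would like to do something like this: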

import sys as _sys

UNICODE_SUPPORT = _sys.stdout.unicode_support

def get_heart():
    if UNICODE_SUPPORT:
        return u"\u2665"  # ♥
    return "heart"

print("I{}U".format(get_heart().upper()))

I would like an attribute such as sys.stdout.unicode_support to be True if the current stdout supports Unicode, and False otherwise.

Answer 1

sys.stdout.encoding

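is None when no encoding has been set for it, for example when output is redirected to a file without special precautions. In that case something like print(u'fo\xe0ba') will fail, because the attempt to encode it as ASCII fails.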
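As a rough sketch of how this can be inspected, the snippet below reads the declared encoding of a stream. The helper name describe_encoding is made up for illustration, and getattr is used on the assumption that a replacement stream object might not have the attribute at all.

```python
import io
import sys


def describe_encoding(stream):
    """Return the stream's declared encoding, or None if it declares none."""
    # getattr guards against file-like objects that lack the attribute entirely.
    return getattr(stream, "encoding", None)


# What the current stdout declares (depends on platform, locale and redirection).
print(describe_encoding(sys.stdout))

# A text stream explicitly opened as ASCII, for comparison.
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")
print(describe_encoding(ascii_stream))
```

Running the script once directly and once with output redirected to a file is a quick way to see whether the value changes in your environment.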

Added: note that most encodings are not "universal": each one supports only a subset of Unicode. "Supports Unicode" is one thing; "supports all of Unicode" (a.k.a. "uses a universal encoding") is another.

UTF-8 is by far the most popular universal encoding, although you may occasionally run into UTF-16, or even UTF-32 (I have personally never met the latter "in the wild" :-).
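To make the "subset" point concrete, here is a small sketch that tries to encode one character with several codecs. The helper name can_encode is hypothetical; it only relies on str.encode raising UnicodeEncodeError for characters the codec cannot represent.

```python
def can_encode(text, encoding):
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# U+221E (infinity) is outside ASCII, Latin-1 and cp1252, but inside every UTF codec.
for name in ("ascii", "latin-1", "cp1252", "utf-8", "utf-16", "utf-32"):
    print(name, can_encode("\u221e", name))
```

The same character can be fine in one legacy encoding and fail in another: the euro sign U+20AC, for instance, exists in cp1252 but not in Latin-1.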

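By the way, even if a device supports, say, UTF-8, that does not mean it has the right glyphs in its fonts to display every code point readably and unambiguously. That is a very different issue.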

Answer 2

This is mostly a hack, but something like this, perhaps:

 UNICODE_SUPPORT = sys.stdout.encoding in ('UTF-8', 'UTF-16', 'UTF-16LE', 'UTF-16BE', 'UTF-32', 'UTF-32LE', 'UTF-32BE')

Or (credit to Martijn Pieters):

 UNICODE_SUPPORT = sys.stdout.encoding.lower().startswith('utf')
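Both name-based checks assume sys.stdout.encoding is a string, and neither says whether one particular character will get through. A possible alternative, sketched here under assumptions, is to try encoding the text you actually want to print with whatever encoding the stream declares. The helper name stream_can_encode and the fallback to ASCII when no encoding is declared are my own choices, not part of either answer.

```python
import io
import sys


def stream_can_encode(text, stream=None):
    """Best-effort check: can text be encoded with the stream's declared encoding?"""
    stream = sys.stdout if stream is None else stream
    # Assumption: treat a missing or None encoding as ASCII, the most conservative guess.
    encoding = getattr(stream, "encoding", None) or "ascii"
    try:
        text.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # LookupError covers an encoding name that Python does not recognise.
        return False
    return True


def get_heart(stream=None):
    """Return a heart symbol if the stream can encode it, else a plain-text fallback."""
    return "\u2665" if stream_can_encode("\u2665", stream) else "heart"


print("I {} U".format(get_heart()))
```

This only tests the encoding step. As with the name-based checks, it cannot tell whether the terminal's font has a glyph for the character.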

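Simply put, Unicode is one huge list containing all the characters used to write the world's languages, including ancient languages and many common and uncommon symbols (U+1F4A9). Each item in that list is called a code point and is identified by a number.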

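UTF-8, UTF-16 and UTF-32 are encodings specifically designed to be able to encode all code points as sequences of bytes. UTF-16 and UTF-32 are multi-byte encodings built from 2-byte and 4-byte units respectively: UTF-32 is fixed-width, while UTF-16 needs two units (a surrogate pair) for code points above U+FFFF. Both come in big-endian and little-endian variants.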
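As an illustration, the following sketch prints how many bytes each UTF encoding uses for three sample code points. The byte-order variants are requested explicitly so that Python does not prepend a byte-order mark; the sample characters are arbitrary choices.

```python
SAMPLES = ("A", "\u2665", "\U0001F4A9")
CODECS = ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")

for char in SAMPLES:
    for name in CODECS:
        encoded = char.encode(name)
        # Print the code point and the raw bytes as hex, so the output itself stays ASCII.
        print("U+{:04X} {:<9} {:>2} bytes  {}".format(
            ord(char), name, len(encoded), encoded.hex()))
```

Comparing the little-endian and big-endian rows for the same character shows the same bytes in a different order.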

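Unicode is designed to be universal, and essentially any encoding outside the UTF-... family supports only a subset of Unicode. Encodings such as cp1252 and iso-8859-15 (partially) support the Latin subset of Unicode.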

python

unicode

stdout