Python 2 maketrans() function doesn't work with Unicode: "the arguments are different lengths" when they actually are
[Python 2]
SUB = string.maketrans("0123456789","₀₁₂₃₄₅₆₇₈₉")
This code produces the error:
ValueError: maketrans arguments must have same length
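I'm not sure why this happens, since the two strings are the same length. My only thought is that the subscript characters are somehow a different length than standard-size characters, but I don't know how to get around this.
No, the arguments are not the same length: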
>>> len("0123456789")
10
>>> len("₀₁₂₃₄₅₆₇₈₉")
30
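You are trying to pass in encoded data; I used UTF-8 here, where each subscript digit is encoded as 3 bytes.
You can't use str.translate() to map ASCII bytes to UTF-8 byte sequences. Decode your strings to unicode and use the slightly different unicode.translate() method; it takes a dictionary instead: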
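As a minimal sketch of the length mismatch, the snippet below compares the subscript digits as a unicode string and as UTF-8 bytes. The variable names are illustrative only, and \u escapes are used so the result does not depend on the source file encoding.

```python
# The ten subscript digits U+2080..U+2089, written with escapes.
subscripts = u"\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089"

# Encoding to UTF-8 turns each of these code points into 3 bytes.
encoded = subscripts.encode("utf-8")

char_count = len(subscripts)  # counts code points
byte_count = len(encoded)     # counts bytes, which is what a Python 2 str holds

print(char_count)
print(byte_count)
```

A plain "₀₁₂₃₄₅₆₇₈₉" literal in Python 2 is the byte string, so maketrans() sees the byte count rather than the character count.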
nummap = {ord(c): ord(t) for c, t in zip(u"0123456789", u"₀₁₂₃₄₅₆₇₈₉")}
This creates a dictionary mapping Unicode code points (integers) to code points, which you can then use on a Unicode string:
>>> nummap = {ord(c): ord(t) for c, t in zip(u"0123456789", u"₀₁₂₃₄₅₆₇₈₉")}
>>> u'99 bottles of beer on the wall'.translate(nummap)
u'\u2089\u2089 bottles of beer on the wall'
>>> print u'99 bottles of beer on the wall'.translate(nummap)
₉₉ bottles of beer on the wall
You can encode the output to UTF-8 again if you wish.
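If the input arrives as UTF-8 bytes, the whole round trip might look like the sketch below. The helper name subscript_digits_utf8 is hypothetical, and the code assumes the input really is valid UTF-8.

```python
# Same mapping as above, written with escapes to stay encoding-independent.
NUMMAP = {
    ord(c): ord(t)
    for c, t in zip(
        u"0123456789",
        u"\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089",
    )
}


def subscript_digits_utf8(raw):
    """Decode UTF-8 bytes, replace ASCII digits with subscripts, re-encode."""
    text = raw.decode("utf-8")      # bytes -> unicode
    translated = text.translate(NUMMAP)
    return translated.encode("utf-8")  # unicode -> bytes


result = subscript_digits_utf8(b"H2O")
```

Here result holds the UTF-8 bytes for H₂O; translation itself always happens on the decoded unicode value.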
For Unicode objects, the translate() method does not accept the optional deletechars argument. Instead, it returns a copy of the s where all characters have been mapped through the given translation table which must be a mapping of Unicode ordinals to Unicode ordinals, Unicode strings or None. Unmapped characters are left untouched. Characters mapped to None are deleted.
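The other two kinds of mapping value described above, a Unicode string and None, can be illustrated with a small sketch; the table contents are made up for the example.

```python
# Keys are code points; values may be a code point, a unicode string, or None.
table = {
    ord(u"-"): None,       # mapped to None: the character is deleted
    ord(u"&"): u" and ",   # mapped to a string: one character becomes several
}

# Characters without an entry in the table pass through unchanged.
result = u"a-b&c-d".translate(table)
```

After this runs, result is u'ab and cd': both hyphens are gone, the ampersand is expanded, and the letters are untouched.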