How to pad and align Unicode strings with special characters in Python?
Python makes it easy to pad and align ASCII strings, like so:
>>> print "%20s and stuff" % ("test")
                test and stuff
>>> print "{:>20} and stuff".format("test")
                test and stuff
But how do you properly pad and align Unicode strings that contain special characters? I tried several approaches, but none of them seems to work:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

def manual(data):
    # Pad by hand, using len() as the width of each string
    for s in data:
        size = len(s)
        print ' ' * (20 - size) + s + " stuff"

def with_format(data):
    # New-style formatting, right-aligned in a field of 20
    for s in data:
        print " {:>20} stuff".format(s)

def with_oldstyle(data):
    # Old-style % formatting, right-aligned in a field of 20
    for s in data:
        print "%20s stuff" % (s)

if __name__ == "__main__":
    # Byte strings, UTF-8 encoded as per the coding declaration above
    data = ("xTest1x", "ツTestツ", "♠️ Test ♠️", "~Test2~")
    # The same strings decoded to unicode objects
    data_utf8 = map(lambda s: s.decode("utf8"), data)
    print "with_format"
    with_format(data)
    print "with_oldstyle"
    with_oldstyle(data)
    print "with_oldstyle utf8"
    with_oldstyle(data_utf8)
    print "manual:"
    manual(data)
    print "manual utf8:"
    manual(data_utf8)
This gives varying output:
with_format
xTest1x stuff
ツTestツ stuff
♠️ Test ♠️ stuff
~Test2~ stuff
with_oldstyle
xTest1x stuff
ツTestツ stuff
♠️ Test ♠️ stuff
~Test2~ stuff
with_oldstyle utf8
xTest1x stuff
ツTestツ stuff
♠️ Test ♠️ stuff
~Test2~ stuff
manual:
xTest1x stuff
ツTestツ stuff
♠️ Test ♠️ stuff
~Test2~ stuff
manual utf8:
xTest1x stuff
ツTestツ stuff
♠️ Test ♠️ stuff
~Test2~ stuff
This is using Python 2.7.
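The variations above come down to what is being counted. In Python 2.7 a plain "..." literal is a byte string, so len() and the format width count UTF-8 bytes; after .decode("utf8") they count code points instead. Neither is the number of columns a terminal uses. The sketch below illustrates the two counts; it is written for Python 3 (not the 2.7 used above) for convenience, and `lengths` is a hypothetical helper name.

```python
def lengths(text):
    """Return (UTF-8 byte count, code point count) for a str."""
    return len(text.encode("utf-8")), len(text)


for sample in ("xTest1x", "ツTestツ"):
    byte_len, cp_len = lengths(sample)
    print(f"{sample!r}: {byte_len} bytes, {cp_len} code points")
# 'xTest1x': 7 bytes, 7 code points
# 'ツTestツ': 10 bytes, 6 code points
```

Padding "ツTestツ" to 20 therefore adds 10 spaces in one case and 14 in the other, while a terminal typically draws each kana two columns wide, so neither count matches what appears on screen.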
There is a wcwidth module, which can be installed via pip.
test.py:
# -*- coding: utf-8 -*-
import wcwidth

def manual_wcwidth(data):
    for s in data:
        # wcswidth() reports terminal columns rather than bytes or code points
        size = wcwidth.wcswidth(s)
        print ' ' * (20 - size) + s + " stuff"

data = (u"xTest1x", u"ツTestツ", u"♠️ Test ♠️", u"~Test2~")
manual_wcwidth(data)
In a Linux console, this script produced perfectly aligned lines for me.
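However, when I ran the script in PyCharm, the lines with kana were still shifted one character to the left, so the result seems to depend on the font and renderer as well.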
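If installing wcwidth is not an option, the same idea can be approximated with the standard library. The following is a rough sketch in Python 3 (not the Python 2.7 used above), and `display_width` and `rjust_display` are hypothetical helper names. It assumes that wide and fullwidth East Asian characters take two columns and that combining marks and format characters take none. It is not equivalent to wcwidth, and the same caveat about fonts and renderers applies.

```python
import unicodedata


def display_width(text):
    """Estimate how many terminal columns `text` occupies."""
    width = 0
    for ch in text:
        # Combining marks, variation selectors and format characters
        # are assumed to take no column of their own.
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            continue
        # Wide (W) and fullwidth (F) characters are assumed to take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def rjust_display(text, total, fill=" "):
    """Right-align `text` in `total` columns based on the estimated width."""
    return fill * max(0, total - display_width(text)) + text


for s in ("xTest1x", "ツTestツ", "~Test2~"):
    print(rjust_display(s, 20) + " stuff")
```

Characters whose width is ambiguous in Unicode, such as ♠, are counted as one column here; a terminal that draws them as emoji may use two, which is one reason the alignment can still drift.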