unicodedata.normalize misses one character during conversion
I am trying to rename files with the script below, but it fails to catch the "’" in "Don’t", which should end up as "Don't". Any ideas on how to do this?
import glob
import os
import unicodedata

def remove_accents(s):
    # Decompose each character (NFKD), then drop the combining marks.
    nfkd_form = unicodedata.normalize('NFKD', s)
    return ''.join(c for c in nfkd_form if not unicodedata.combining(c))

for fname in glob.glob("**/*.mp3", recursive=True):
    new_fname = remove_accents(fname)
    if new_fname != fname:
        try:
            print('renaming non-ascii filename to', new_fname)
            os.rename(fname, new_fname)
        except Exception as e:
            print(e)
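Wrong tool: unicodedata.normalize is not meant to remove accents at all. It only converts text between Unicode normalization forms.
To down-convert text to ASCII, take a look at unidecode: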
>>> from unidecode import unidecode
>>> unidecode("Don’t")
"Don't"
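If installing unidecode is not an option, here is a rough standard-library sketch of why the original function misses the character and one way to patch it. The right single quotation mark (U+2019) has no decomposition mapping and is not a combining character, so NFKD passes it through untouched. The `PUNCT_MAP` table and its contents are my own assumptions for illustration, not something unidecode or unicodedata provides, and it covers only a handful of characters.

```python
import unicodedata

RIGHT_QUOTE = "\u2019"  # RIGHT SINGLE QUOTATION MARK, the character in "Don’t"

# U+2019 has no decomposition mapping, so NFKD returns it unchanged.
print(repr(unicodedata.decomposition(RIGHT_QUOTE)))               # ''
print(unicodedata.normalize("NFKD", RIGHT_QUOTE) == RIGHT_QUOTE)  # True

# Hypothetical hand-written table (an assumption, extend as needed):
# typographic punctuation mapped to its closest ASCII equivalent.
PUNCT_MAP = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
})

def remove_accents(s):
    # Replace typographic punctuation first, then strip combining marks.
    nfkd_form = unicodedata.normalize("NFKD", s.translate(PUNCT_MAP))
    return "".join(c for c in nfkd_form if not unicodedata.combining(c))

print(remove_accents("Don\u2019t"))  # Don't
print(remove_accents("Caf\u00e9"))   # Cafe
```

This keeps the accent stripping from the question and only adds the translation step. Anything outside the table and without a decomposition still comes through unchanged, which is the gap unidecode is designed to fill.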