Change file encoding scheme in Python
I am trying to open a file with latin-1 encoding in order to produce a file with a different encoding. I get a NameError saying unicode is not defined. Here is the piece of code I use:
sourceEncoding = "latin-1"
targetEncoding = "utf-8"
source = open(r'C:\Users\chsafouane\Desktop\saf.txt')
target = open(r'C:\Users\chsafouane\Desktop\saf2.txt', "w")
target.write(unicode(source.read(), sourceEncoding).encode(targetEncoding))
I'm not used to dealing with files at all, so I don't know whether there is a module I should import to use "unicode".
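The fact that you see unicode not defined indicates that you are on Python 3, where the unicode built-in no longer exists because str is already Unicode text. Here is a code snippet that generates a latin1-encoded file, then does what you want: it reads in the latin1-encoded file and writes out a UTF-8-encoded file: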
# Generate a latin1-encoded file
txt = u'U+00AxNBSP¡¢£¤¥¦§¨©ª«¬SHY®¯U+00Bx°±²³´µ¶·¸¹º»¼½¾¿U+00CxÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏU+00DxÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßU+00ExàáâãäåæçèéêëìíîïU+00Fxðñòóôõö÷øùúûüýþÿ'
latin1 = txt.encode('latin1')
with open('example-latin1.txt', 'wb') as fid:
    fid.write(latin1)

# Read in the latin1 file
with open('example-latin1.txt', 'r', encoding='latin1') as fid:
    contents = fid.read()
assert contents == latin1.decode('latin1')  # sanity check

# Spit out a UTF8-encoded file.
# The encoding is given explicitly because open()'s default depends on the
# platform locale (on Windows it is often cp1252, not UTF-8).
with open('converted-utf8.txt', 'w', encoding='utf8') as fid:
    fid.write(contents)
If you want the output in an encoding other than UTF-8, pass that encoding as the encoding argument to open, for example:
with open('converted-utf_32.txt', 'w', encoding='utf_32') as fid:
    fid.write(contents)
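Applied to the two-file situation in the question, the same read-then-write pattern could be wrapped up roughly as follows. This is only a sketch: the helper name convert_encoding and the temporary file names are made up for illustration, and it assumes the file is small enough to read into memory in one go.

```python
import os
import tempfile


def convert_encoding(src_path, dst_path, source_encoding="latin-1", target_encoding="utf-8"):
    """Re-encode the text file at src_path into dst_path (hypothetical helper)."""
    # Decode the source bytes into a str; newline="" keeps line endings untouched.
    with open(src_path, "r", encoding=source_encoding, newline="") as source:
        contents = source.read()
    # Encode the str back into bytes using the target encoding.
    with open(dst_path, "w", encoding=target_encoding, newline="") as target:
        target.write(contents)


# Usage example with throwaway files standing in for saf.txt and saf2.txt.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "saf.txt")
    dst = os.path.join(tmp, "saf2.txt")
    with open(src, "wb") as fid:
        fid.write("café".encode("latin-1"))  # one byte for é: b'caf\xe9'
    convert_encoding(src, dst, "latin-1", "utf-8")
    with open(dst, "rb") as fid:
        converted = fid.read()

print(converted)  # b'caf\xc3\xa9' (é is two bytes in UTF-8)
```

Opening both files in text mode with an explicit encoding replaces the Python 2 idiom unicode(data, sourceEncoding).encode(targetEncoding): open decodes on read and encodes on write.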