Cannot convert csv from utf-8 to ansi with csv writer python 2.6
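I am trying to load a UTF-8 encoded .csv file and write it out in cp1252 (ANSI) format with a pipe delimiter. The following code works in Python 3.6, but I need it to work in Python 2.6. However, the 'open' function does not accept the encoding keyword in Python 2.6.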
import datetime
import csv
# Define what filenames to read
filenames = ["FILE1", "FILE2"]
infilenames = [filename + ".csv" for filename in filenames]
outfilenames = [filename + "_out_.csv" for filename in filenames]
# Read filenames in utf-8 and write them in cp1252
for infilename, outfilename in zip(infilenames, outfilenames):
    infile = open(infilename, "rt", encoding="utf8")
    reader = csv.reader(infile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    outfile = open(outfilename, "wt", encoding="cp1252")
    # The escape character must itself be escaped in the string literal
    writer = csv.writer(outfile, delimiter='|', quotechar='"', quoting=csv.QUOTE_NONE, escapechar='\\')
    for row in reader:
        writer.writerow(row)
    infile.close()
    outfile.close()
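I have tried several solutions:
- Not specifying an encoding. This raises errors on some unicode characters.
- Using the io library (io.open instead of open). This results in "Type error: cannot write str to text in text stream".
Does anyone know the correct solution for Python 2.X?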
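There may be some redundant code here, but I got it working by doing the following:
- First, I use the .decode and .encode functions to re-encode the text as "cp1252".
- Then I read the csv from the cp1252-encoded file and write it to a new csv.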
...
import datetime
import csv
# Define what filenames to read
filenames = ["FILE1", "FILE2"]
infilenames = [filename + ".csv" for filename in filenames]
outfilenames = [filename + "_out_.csv" for filename in filenames]
midfilenames = [filename + "_mid_.csv" for filename in filenames]
# Iterate over each file
for infilename, outfilename, midfilename in zip(infilenames, outfilenames, midfilenames):
    # Open file and read utf-8 text, then encode in cp1252
    # (characters that cp1252 cannot represent are silently dropped by "ignore")
    infile = open(infilename, "rb")
    infilet = infile.read()
    infilet = infilet.decode("utf-8")
    infilet = infilet.encode("cp1252", "ignore")
    # write cp1252 encoded file
    midfile = open(midfilename, "wb")
    midfile.write(infilet)
    midfile.close()
    # read csv with new cp1252 encoding
    # (the Python 2 csv module expects files opened in binary mode)
    midfile = open(midfilename, "rb")
    reader = csv.reader(midfile, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
    # define output
    outfile = open(outfilename, "wb")
    # The escape character must itself be escaped in the string literal
    writer = csv.writer(outfile, delimiter='|', quotechar='"', quoting=csv.QUOTE_NONE, escapechar='\\')
    # write output to new csv file
    for row in reader:
        writer.writerow(row)
    print("written file " + outfilename)
    infile.close()
    midfile.close()
    outfile.close()
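
The intermediate file is probably not needed. Here is a minimal sketch of the same decode/encode idea done row by row, assuming the input really is UTF-8 and that dropping characters cp1252 cannot represent is acceptable (as with "ignore" above). The function name utf8_csv_to_cp1252_pipe and the PY2 flag are made up for illustration; the Python 2 branch relies on the csv module working on byte strings, and the Python 3 branch is included only so the same file runs on both.

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def utf8_csv_to_cp1252_pipe(infilename, outfilename, errors="ignore"):
    """Hypothetical helper: UTF-8 comma CSV in, cp1252 pipe-delimited CSV out."""
    if PY2:
        # Python 2: csv reads and writes byte strings, so use binary files
        infile = open(infilename, "rb")
        outfile = open(outfilename, "wb")
    else:
        # Python 3: csv works on text, so let the file objects do the transcoding
        infile = io.open(infilename, "r", encoding="utf-8", newline="")
        outfile = io.open(outfilename, "w", encoding="cp1252",
                          errors=errors, newline="")
    try:
        reader = csv.reader(infile, delimiter=",", quotechar='"')
        writer = csv.writer(outfile, delimiter="|", quoting=csv.QUOTE_NONE,
                            escapechar="\\")
        for row in reader:
            if PY2:
                # Re-encode each cell: UTF-8 bytes -> unicode -> cp1252 bytes
                row = [cell.decode("utf-8").encode("cp1252", errors)
                       for cell in row]
            writer.writerow(row)
    finally:
        infile.close()
        outfile.close()
```

With the file lists from the answer, this would be called as utf8_csv_to_cp1252_pipe(infilename, outfilename) inside the loop. Passing errors="replace" instead writes "?" for characters that have no cp1252 equivalent, which makes data loss easier to spot.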