Merge multiple CSV files with different columns in Python: error with writerow
I have a large number of CSV files/dataframes that are too large to hold in memory together. I also noticed that the number of columns differs between these dataframes. My columns are permutations of "ACGT" (DNA sequences). I followed the instructions in this question on how to write multiple CSVs with different columns, but I get the following error: AttributeError: 'str' object has no attribute 'keys'. I found this question that addresses the error, but I am unsure where to edit the code to make the 'line' object a dictionary. I am also worried that my CSV files, which have an index column without a header value, may be breaking my code, or that the format of my fieldnames (strings derived from permutations) may be an issue. If there is a way to concatenate multiple CSV files with different columns in another language, I am amenable to that; however, I have run into issues with this question as well.
import glob
import csv
import os

mydir = "test_csv/"
file_list = glob.glob(mydir + "/*.csv") # Include slash or it will search in the wrong directory!!
file_list

import itertools

fieldnames = []
for p in itertools.product('ACGT', repeat=8):
    fieldnames.append("".join(p))

for filename in file_list:
    with open(filename, "r", newline="") as f_in:
        reader = csv.reader(f_in)
        headers = next(reader)

with open("Outcombined.csv", "w", newline="") as f_out:
    writer = csv.DictWriter(f_out, fieldnames=fieldnames)
    for filename in file_list:
        with open(filename, "r", newline="") as f_in:
            reader = csv.DictReader(f_in)
            for line in headers:
                writer.writerow(line)
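You only need to write the header once, so write it before the loop over your file_list. The AttributeError comes from iterating over headers, which is a list of strings; iterate over the DictReader instead so that each line is a dictionary: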
with open('Outcombined.csv','w',newline='') as f_out:
    writer = csv.DictWriter(f_out,fieldnames=fieldnames)
    writer.writeheader() # write header based on `fieldnames`
    for filename in file_list:
        with open(filename,'r',newline='') as f_in:
            reader = csv.DictReader(f_in)
            for line in reader:
                writer.writerow(line)
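DictWriter will take care of placing each value under the correct header. Columns that a given file does not contain are filled with the writer's restval, which defaults to an empty string.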
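The question also mentions an index column with no header value. With DictReader that column ends up under the key '' (an empty string), and a DictWriter with the default extrasaction='raise' raises a ValueError for any key that is not in fieldnames. One possible way around this, shown here as a minimal sketch rather than the definitive fix, is to pass extrasaction='ignore'. The function name merge_csv_streams, the in-memory streams, and the three-column sample data are assumptions made for illustration; the real files would be opened from file_list as above.

```python
import csv
import io


def merge_csv_streams(streams, fieldnames, out):
    """Append rows from several CSV streams to `out`, aligned on `fieldnames`."""
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        restval="",             # written for columns a given file lacks
        extrasaction="ignore",  # drop keys not in fieldnames (e.g. the unnamed index column)
    )
    writer.writeheader()  # header is written once, before any rows
    for stream in streams:
        # Each row is a dict keyed by that file's own header line
        for row in csv.DictReader(stream):
            writer.writerow(row)


# Hypothetical sample data: two "files" with an unnamed index column
# and only partly overlapping columns.
a = io.StringIO(",AAAA,ACGT\n0,1,2\n")
b = io.StringIO(",ACGT,TTTT\n0,3,4\n")
out = io.StringIO(newline="")

merge_csv_streams([a, b], ["AAAA", "ACGT", "TTTT"], out)
merged = out.getvalue()
print(merged)
```

Because rows are read and written one at a time, only a single row is held in memory, which suits inputs that are too large to load together. Note that 'ignore' silently drops the index values; if they are needed, the unnamed column would have to be given a name and added to fieldnames instead.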