Converting CSV to TXT (tab-delimited) while iterating over the files in a directory in Python
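I have ~1000 files containing two-column arrays, with varying numbers of rows, all with the .csv extension. I need to read each line of a file, skip the first header row, and then write everything to a tab-delimited .txt file. I tried to write this myself in Python.
I want to convert numbers.csv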
X,Y
1,2
3,4
...
into
1[tab]2
3[tab]4
...
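My code is below.
At the "next(csv_file)" call, my program raises a "StopIteration" error and the contents of the input file are deleted. If I remove that line from my code, I still overwrite the input file, but nothing happens to the output file.
Can someone help me fix this?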
`
import csv
import os
cwd = os.getcwd()
#print(cwd)
for file in os.listdir():
    file_name, file_ext = os.path.splitext(file)
    if file_ext == '.csv':
        with open(file,'r') as csv_file:
            csv_reader = csv.reader(csv_file)
            next(csv_file)
            for line in csv_reader:
                with open(file, 'w') as new_txt: #new file has .txt extension
                    txt_writer = csv.writer(line, delimiter = '\t') #writefile
                    txt_writer.writerow(line) #write the lines to file`
In[2]: import csv
In[3]: with open('test_file.txt', 'r') as f:
  ...:     for line in f:
  ...:         print(line)
  ...:
X,Y
1,2
3,4
In[4]: with open('test_file.txt', 'r') as f:
  ...:     reader = csv.DictReader(f)
  ...:     fieldnames = reader.fieldnames
  ...:     result = list(reader)
  ...:
  ...: with open('test_output.tsv', 'w') as f:
  ...:     writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter='\t')
  ...:     writer.writeheader() # remove this line if you don't want header
  ...:     writer.writerows(result)
  ...:
In[5]: with open('test_output.tsv', 'r') as f:
  ...:     for line in f:
  ...:         print(line)
  ...:
X Y
1 2
3 4
You're on the right track. I made a few changes to your code:
import csv
import os
cwd = os.getcwd()
#print(cwd)
for file in os.listdir('.'): # use the directory name here
    file_name, file_ext = os.path.splitext(file)
    if file_ext == '.csv':
        with open(file,'r') as csv_file:
            csv_reader = csv.reader(csv_file)
            next(csv_reader) ## skip one line (the first one); Python 3 readers have no .next() method
            newfile = file + '.txt' # e.g. numbers.csv -> numbers.csv.txt
            for line in csv_reader:
                # newline='' stops the csv module's line endings from being translated again
                with open(newfile, 'a', newline='') as new_txt: #new file has .txt extn
                    txt_writer = csv.writer(new_txt, delimiter = '\t') #writefile
                    txt_writer.writerow(line) #write the lines to file
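Because the output file is opened inside the loop, you need to open it with 'a' (append) rather than 'w' (write). With 'w', every iteration truncates the file, so you end up with only one row: the last one. And if that last row is blank, you are left with a file that contains a blank line, i.e. nothing.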
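One drawback of append mode is that running the script a second time adds the same rows again. As an alternative, here is a minimal sketch that opens each output file once with 'w', outside the row loop, so a re-run simply replaces the output. The function name `convert_csv_to_tsv` and its `directory` parameter are made up for illustration; the sketch also assumes you want `numbers.csv` to become `numbers.txt` (not `numbers.csv.txt`) and plain `\n` line endings.

```python
import csv
import os


def convert_csv_to_tsv(directory="."):
    """Write a tab-delimited .txt copy of every .csv file in `directory`.

    Hypothetical helper, not part of the original answer.
    Returns the list of output paths that were written.
    """
    written = []
    for name in sorted(os.listdir(directory)):
        base, ext = os.path.splitext(name)
        if ext.lower() != ".csv":
            continue
        src = os.path.join(directory, name)
        dst = os.path.join(directory, base + ".txt")  # different name, so the input is never truncated
        # newline="" lets the csv module handle line endings itself.
        with open(src, "r", newline="") as csv_file, \
                open(dst, "w", newline="") as txt_file:
            reader = csv.reader(csv_file)
            writer = csv.writer(txt_file, delimiter="\t", lineterminator="\n")
            next(reader, None)  # skip the header; the default avoids StopIteration on an empty file
            writer.writerows(reader)  # copy every remaining row
        written.append(dst)
    return written
```

Calling `convert_csv_to_tsv()` with no argument processes the current working directory. Since the input and output paths differ, the .csv files are left untouched, which avoids the situation in the question where the input was emptied and `next()` then raised StopIteration.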