Reading only specific portions of a text file and writing them to different text files
I have a large text file with the following layout:
158 lines of Text
2000 lines of Data
140 lines of Text
2000 lines of Data
140 lines of Text
.
.
.
There are 5 blocks of 2000 data lines in total, and I want Python to read them and write each block to its own text file, like this:
Data1.txt
Data2.txt
Data3.txt
.
.
Browsing online, I found the following in "reading sections from a large text file in python efficiently":
def get_block(beg,end):
    output=open("Output.txt",'a')
    with open("input.txt",'r') as f:
        for line in f:
            line=line.strip("\r\n")
            line=line.split("\t")
            position=str(line[0])
            if int(position)<=beg:
                pass
            elif int(position)>=end:
                break
            else:
                for i in line:
                    output.write(("%s\t")%(i))
                output.write("\n")
That question is similar to mine, but when I run this function I get the following error:
File "/Users/aperego/Desktop/HexaPaper/DataToPlot/ReadThermo.py", line 8, in get_block
if int(position)<=beg:
ValueError: invalid literal for int() with base 10: 'LAMMPS (5 Jun 2019)'
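I think this is because my input file has many lines of text between the data sets. Also, the function accepts only a single line interval, whereas I want my script to run once and extract all the lines that contain data.
I don't know whether modifying this script is the best way to solve the problem, or whether there is a better way to achieve what I want. Any help is appreciated!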
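The traceback is consistent with that guess: int() is applied to the first field of every line, and a header line such as 'LAMMPS (5 Jun 2019)' is not a number. As a minimal sketch, assuming data lines start with an integer field and text lines do not (an assumption about the file, not something the question shows), a hypothetical helper named starts_with_int can test a line before converting it:

```python
def starts_with_int(line):
    """Return True if the first whitespace-separated field parses as an int.

    Hypothetical helper: it assumes data lines begin with an integer field,
    which the question does not confirm.
    """
    fields = line.split()
    if not fields:
        return False  # blank line
    try:
        int(fields[0])
    except ValueError:
        return False
    return True


print(starts_with_int("LAMMPS (5 Jun 2019)"))  # False: header text from the traceback
print(starts_with_int("100\t1.5\t2.5"))        # True: made-up data line
```

A check like this only helps if no text line happens to start with an integer, so it is worth testing against the real file before relying on it.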
If you know how many lines to skip and how many to read, use a for-loop with next() to skip lines and readline() to read lines.
# fin - file input
# fout - file output
fin = open('input.txt')
# skip 158 lines
for _ in range(158):
    next(fin)
# write 2000 lines
with open('Data1.txt', 'w') as fout:
    for _ in range(2000):
        fout.write(fin.readline())
# skip 140 lines
for _ in range(140):
    next(fin)
# write 2000 lines
with open('Data2.txt', 'w') as fout:
    for _ in range(2000):
        fout.write(fin.readline())
# ... rest ...
fin.close()
You can also reduce it to
fin = open('input.txt')
# skip 158 lines
for _ in range(158):
    next(fin)
# write 2000 lines
with open('Data1.txt', 'w') as fout:
    for _ in range(2000):
        fout.write(fin.readline())
# --- the remaining blocks all skip the same number of lines
# range(2, 6) yields 2, 3, 4, 5, so this writes Data2.txt to Data5.txt
for x in range(2, 6):
    filename = 'Data{}.txt'.format(x)
    # skip 140 lines
    for _ in range(140):
        next(fin)
    # write 2000 lines
    with open(filename, 'w') as fout:
        for _ in range(2000):
            fout.write(fin.readline())
fin.close()
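The same skip-then-copy pattern can be driven by a list of counts. The sketch below uses itertools.islice to take a fixed number of lines from the open file. The function name split_blocks, its parameters, and the default Data{}.txt pattern are illustrative choices, and it relies on the line counts given in the question being exact.

```python
from itertools import islice


def split_blocks(path, skips, block_size, pattern="Data{}.txt"):
    """Copy fixed-size blocks of lines from path into numbered files.

    skips[i] is the number of lines to discard before block i + 1.
    Returns the list of file names written.
    """
    written = []
    with open(path) as fin:
        for number, skip in enumerate(skips, start=1):
            # Consume and discard the text lines before this block.
            for _ in islice(fin, skip):
                pass
            filename = pattern.format(number)
            with open(filename, "w") as fout:
                # islice stops early if the file runs out of lines.
                fout.writelines(islice(fin, block_size))
            written.append(filename)
    return written


# Counts from the question: 158 lines of text, then 140 before each later block.
# split_blocks("input.txt", [158, 140, 140, 140, 140], 2000)
```

Because the input is opened in a with block, it is closed even if writing one of the output files fails, and the file is streamed line by line rather than loaded into memory.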