Split a large text file into multiple files using delimiters
I want to split a large text file into multiple text files using a delimiter such as [TEST], as shown below:
texttexttext
texttexttext
texttexttext
[TEST] title1
texttexttext1
texttexttext1
texttexttext1
[TEST] title2
texttexttext2
texttexttext2
texttexttext2
[TEST] title3
texttexttext3
texttexttext3
texttexttext3
This should be split into multiple text files:
title1.txt containing:
texttexttext1
texttexttext1
texttexttext1
title2.txt containing:
texttexttext2
texttexttext2
texttexttext2
title3.txt containing:
texttexttext3
texttexttext3
texttexttext3
How can I do this?
First, read the large text.txt file:
with open('text.txt') as text:
    # There are two newlines between the paragraphs, right?
    contents: list[str] = text.read().split('\n\n')
Then write the pieces into numbered files:
for index, element in enumerate(contents):
    with open(f'text{index}.txt', 'w') as file:
        file.write(element)
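Note that the snippet above splits on blank lines and numbers the files, whereas the question asks for a split on the [TEST] lines with the title as the file name. A minimal in-memory sketch of that variant might look like the following; the helper name split_sections and its delimiter parameter are made up for illustration, and it assumes the title is whatever follows the delimiter on the same line.

```python
def split_sections(text: str, delimiter: str = "[TEST]") -> dict[str, str]:
    """Map each title that follows the delimiter to the lines beneath it."""
    sections: dict[str, list[str]] = {}
    current = None  # lines before the first delimiter are ignored
    for line in text.splitlines():
        if line.startswith(delimiter):
            # Everything after the delimiter on this line is the title.
            title = line[len(delimiter):].strip()
            current = sections.setdefault(title, [])
        elif current is not None:
            current.append(line)
    # Rejoin each section's lines, ending with a newline.
    return {title: "\n".join(lines) + "\n" for title, lines in sections.items()}
```

Each entry could then be written out with open(f'{title}.txt', 'w'). This still holds the whole file in memory, so it only suits inputs that fit comfortably in RAM.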
A solution that reads and writes at the same time, so that only the current line is held in memory, is:
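with open('input.txt') as f:
    f_out = None
    for line in f:
        if line.startswith('[TEST]'):  # we need a new output file
            # strip() drops the trailing newline so it does not end up in the file name
            title = line.split(' ', 1)[1].strip()
            if f_out:
                f_out.close()
            f_out = open(f'{title}.txt', 'w')
        elif f_out:  # lines before the first [TEST] are skipped
            f_out.write(line)
    if f_out:
        f_out.close()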
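If the titles may contain characters that are not valid in file names, or the output should go to a separate directory, the same streaming idea can be wrapped in a function. This is only a sketch: the name split_file, the out_dir parameter, and the rule of replacing unsafe characters with underscores are assumptions, not part of the question.

```python
import re
from pathlib import Path


def split_file(source: Path, out_dir: Path, delimiter: str = "[TEST]") -> list[Path]:
    """Stream source line by line, starting a new file at each delimiter line."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: list[Path] = []
    f_out = None
    try:
        with open(source, encoding="utf-8") as f:
            for line in f:
                if line.startswith(delimiter):
                    if f_out:
                        f_out.close()
                    title = line[len(delimiter):].strip()
                    # Assumed policy: replace anything unsafe in a file name with "_".
                    safe = re.sub(r"[^\w.-]+", "_", title) or "untitled"
                    path = out_dir / f"{safe}.txt"
                    f_out = open(path, "w", encoding="utf-8")
                    written.append(path)
                elif f_out:
                    # Lines before the first delimiter are skipped.
                    f_out.write(line)
    finally:
        # Close the last output file even if reading fails part-way.
        if f_out:
            f_out.close()
    return written
```

Calling split_file(Path('input.txt'), Path('out')) would return the list of files it created. If the same title appears twice, the later section overwrites the earlier one, so unique titles are assumed.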