Parsing data in block units using a list
Input:
ID information1
Aa information1-1
Ba information1-2
Ca Homo sapiens
Da information1-4
//
ID information2
Aa information2-1
Ba information2-2
Ca information2-3
Da information2-4
//
Expected output:
ID information1
Aa information1-1
Ba information1-2
Ca Homo sapiens
Da information1-4
//
Result:
ID information1
ID information1
Aa information1-1
ID information1
Aa information1-1
Ba information1-2
ID information1
Aa information1-1
Ba information1-2
Ca Homo sapiens
ID information1
Aa information1-1
Ba information1-2
Ca Homo sapiens
Da information1-4
ID information1
Aa information1-1
Ba information1-2
Ca Homo sapiens
Da information1-4
//
Code:
word = 'Homo sapiens'
with open(input_file, 'r') as input, open(output_file, 'w') as output:
    list_block = []
    str_block = ""
    for line in input:
        if not ("//" in line):
            str_block += line
        elif "//" in line:
            if word in str_block:
                list_block.append(str_block)
                str_block = ""
        output.write(str_block)
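I have an input file containing blocks of information, each terminated by a double slash ('//'). I want to extract only the blocks that contain 'Homo sapiens'. When I parse the data with my code, I get output like the one shown under 'Result'. Is there any way to fix my code?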
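Since your blocks are separated by '//', it is much easier to read the whole file and then split it on that pattern. This gives you the list of blocks you need, and the rest is straightforward. Here is an example that produces the desired output.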
word = 'Homo sapiens'
with open(input_file, 'r') as fi, open(output_file, 'w') as fo:
    for block in fi.read().split('//'):  # read file, split in blocks and iterate over them
        if word in block:
            fo.write(block)
            fo.write('//')
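If the file is too large to read in one go, the line-by-line approach from the question can also be made to work. The repeated output in 'Result' suggests that the write call runs on every loop iteration, so the growing buffer is written again and again; the fix is to write a block only once its '//' line has been reached. The following is a minimal sketch under that assumption. The helper name filter_blocks and the in-memory sample are illustrative and not part of the original code.

```python
import io

def filter_blocks(lines, word, delimiter="//"):
    """Yield each block (delimiter line included) whose text contains `word`."""
    block = []  # lines of the block currently being read
    for line in lines:
        block.append(line)
        if line.strip() == delimiter:
            text = "".join(block)
            if word in text:
                yield text
            block = []  # start fresh whether or not the block matched

# In-memory stand-in for the input file from the question.
sample = io.StringIO(
    "ID information1\n"
    "Aa information1-1\n"
    "Ba information1-2\n"
    "Ca Homo sapiens\n"
    "Da information1-4\n"
    "//\n"
    "ID information2\n"
    "Aa information2-1\n"
    "Ba information2-2\n"
    "Ca information2-3\n"
    "Da information2-4\n"
    "//\n"
)

result = "".join(filter_blocks(sample, "Homo sapiens"))
print(result, end="")
```

With real files, the same generator can be passed straight to the output handle, for example fo.writelines(filter_blocks(fi, word)). Note that this sketch silently drops a trailing block that is not terminated by '//'.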