Read text file to store data in a nested list
I have an input file in the following format.
SOME TEXT SOME TEXT SOME TEXT
SOME TEXT
RETCODE = 0
A = 1
B = 5
C = 3

D = 0
E = 0

D = 4
E = 1
G = 1
blah blah
--- ENDBLOCK
SOME TEXT
RETCODE = 0
A = 3
B = 2
C = 8
D = 6
E = 9
F = 3
blah blah
blah blah
--- ENDBLOCK
SOME TEXT
RETCODE = 0
A = 7
B = 2
C = 2
D = 9
E = 0

D = 1
E = 4

D = 7
E = 0
F = 1

D = 1
G = 8
blah blah
blah blah
--- ENDBLOCK
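The data is organized in blocks. Each block starts with a line of the form A = some value and ends with a line that says --- ENDBLOCK. Each block contains sub-blocks, and each sub-block starts after an empty line.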
For example, in the first block the first sub-block is A=1, B=5, C=3, the second sub-block is D=0, E=0, and the third sub-block is D=4, E=1, G=1.
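My goal is a bit complicated: I want a nested list that holds the values of A through G for each sub-block. If a letter has no value in a sub-block, its position should be left empty. Below is the desired output.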
[
[
[1,5,3,,,,],[,,,0,0,,],[,,,4,1,,1]
],
[
[3,2,8,6,9,3,]
],
[
[7,2,2,9,0,,],[,,,1,4,,],[,,,7,0,1,],[,,,1,,,8]
]
]
My current code reads the file and stores the data to some extent, but I am still far from the desired output. Thanks for your help.
f=open("file.txt","r").read().split("ENDBLOCK")
rows = []
for line in f:
    for subline in line.split("\n\n"):
        if " = " in subline and not "RETCODE" in subline:
            rows.append(subline.replace(" ","").split())
>>> rows
[
['A=1', 'B=5', 'C=3'], ['D=0', 'E=0'], ['D=4', 'E=1', 'G=1'],
['D=1', 'E=4'], ['D=7', 'E=0', 'F=1'], ['D=1', 'G=8']
]
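One reason this falls short of the desired output is that each entry in rows keeps only the letters that were present, so the positions of A through G are lost. Below is a minimal sketch of padding one such entry to seven slots, assuming tokens of the form 'A=1' as in the output above. The helper name to_row is hypothetical and not part of the original code, and the sketch does not restore the per-block grouping.

```python
LETTERS = "ABCDEFG"


def to_row(tokens):
    """Turn tokens such as ['D=4', 'E=1', 'G=1'] into a 7-slot row for A..G."""
    values = {}
    for token in tokens:
        # 'D=4' -> key 'D', number '4'
        key, _, number = token.partition("=")
        values[key] = int(number)
    # Letters that did not occur in the sub-block become empty strings.
    return [values.get(letter, "") for letter in LETTERS]


print(to_row(["D=4", "E=1", "G=1"]))  # ['', '', '', 4, 1, '', 1]
```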
You can do it like this:
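import re
BLOCK_END = "--- ENDBLOCK"
# \s* accepts both indented and unindented lines; \d+ requires at least one digit
LETTER_TO_VALUE_REGEX = re.compile(r"\s*(?P<letter>[A-G]) = (?P<number>\d+)")
LETTERS = ["A", "B", "C", "D", "E", "F", "G"]
SUB_BLOCK_END = "\n\n"
# Read the whole input file into one string
with open("file.txt") as f:
    text = f.read()
top_list = []
for block in text.split(BLOCK_END):
    sub_list = []
    for sub_block in block.split(SUB_BLOCK_END):
        # Collect the letter -> number pairs found in this sub-block
        letters_to_numbers_dict = {}
        for line in sub_block.split("\n"):
            match = LETTER_TO_VALUE_REGEX.match(line)
            if match:
                letters_to_numbers_dict[match["letter"]] = int(match["number"])
        # One slot per letter, "" where the sub-block has no value for it
        inner_list = [letters_to_numbers_dict.get(letter, "") for letter in LETTERS]
        # Skip sub-blocks that contain no A-G values at all
        if letters_to_numbers_dict:
            sub_list.append(inner_list)
    # Skip chunks without any values, such as text after the last ENDBLOCK
    if sub_list:
        top_list.append(sub_list)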
After that, "top_list" is the list you want, with missing values stored as empty strings.
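As an alternative, here is a sketch that walks the file line by line instead of splitting on "\n\n", which may be more forgiving of indentation and Windows line endings. It assumes a blank line or an ENDBLOCK line closes the current sub-block, and that values are non-negative integers. The names parse_blocks and PAIR, and the shortened sample, are illustrative only.

```python
import re

LETTERS = "ABCDEFG"
# Matches lines such as "A = 1" or "   D = 4". "RETCODE = 0" does not match
# because the key must be a single letter from A to G.
PAIR = re.compile(r"\s*([A-G])\s*=\s*(\d+)\s*$")


def parse_blocks(text):
    """Return one list per block, holding one 7-slot row per sub-block."""
    blocks, rows, values = [], [], {}
    for line in text.splitlines():
        match = PAIR.match(line)
        if match:
            values[match.group(1)] = int(match.group(2))
            continue
        is_block_end = line.strip().startswith("--- ENDBLOCK")
        # A blank line or the end of the block closes the current sub-block.
        if (is_block_end or not line.strip()) and values:
            rows.append([values.get(letter, "") for letter in LETTERS])
            values = {}
        if is_block_end and rows:
            blocks.append(rows)
            rows = []
    return blocks


sample = """\
SOME TEXT
RETCODE = 0
A = 1
B = 5
C = 3

D = 0
E = 0

D = 4
E = 1
G = 1
blah blah
--- ENDBLOCK
SOME TEXT
RETCODE = 0
A = 3
B = 2
C = 8
D = 6
E = 9
F = 3
blah blah
--- ENDBLOCK
"""
result = parse_blocks(sample)
print(result)
```

Values that appear after the last ENDBLOCK line are dropped in this sketch. If empty slots should be None rather than "", change the default passed to values.get.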