Read text file to store data in a nested list

I have an input file in the following format.

SOME TEXT SOME TEXT SOME TEXT
SOME TEXT
RETCODE = 0

                        A = 1
                        B = 5
                        C = 3

                        D = 0
                        E = 0

                        D = 4
                        E = 1
                        G = 1


blah blah


---    ENDBLOCK

SOME TEXT
RETCODE = 0
                        A = 3
                        B = 2
                        C = 8
                        D = 6
                        E = 9
                        F = 3



blah blah

blah blah

---    ENDBLOCK

SOME TEXT
RETCODE = 0
                        A = 7
                        B = 2
                        C = 2
                        D = 9
                        E = 0

                        D = 1
                        E = 4

                        D = 7
                        E = 0
                        F = 1

                        D = 1
                        G = 8


blah blah

blah blah

---    ENDBLOCK

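The file presents the data in blocks. Each block starts with A = some value and ends with a line that says --- ENDBLOCK. Each block contains sub-blocks, and each sub-block starts after a blank line.

For example, in the first block the first sub-block is A=1, B=5, C=3, the second sub-block is D=0, E=0, and the third sub-block is D=4, E=1, G=1.

My goal, which is a bit complicated, is to get a list showing the A to G values for each sub-block. If a value is missing, its slot should be left empty. Below is the desired output.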

[
    [
        [1,5,3,,,,],[,,,0,0,,],[,,,4,1,,1]
    ],
    [
        [3,2,8,6,9,3,]
    ],
    [
        [7,2,2,9,0,,],[,,,1,4,,],[,,,7,0,1,],[,,,1,,,8]
    ]
]

My current code reads the file and stores the data to some extent, but I am still far from the desired output. Thanks for your help.

f=open("file.txt","r").read().split("ENDBLOCK")
rows = []
for line in f:
    for subline in line.split("\n\n"):
        if " = " in subline and not "RETCODE" in subline:
            rows.append(subline.replace(" ","").split())

>>> rows
[
    ['A=1', 'B=5', 'C=3'], ['D=0', 'E=0'], ['D=4', 'E=1', 'G=1'], 
    ['D=1', 'E=4'], ['D=7', 'E=0', 'F=1'], ['D=1', 'G=8']
]

You can do it like this:

import re

BLOCK_END = "---    ENDBLOCK"
# Matches an indented line such as "    A = 1"; unindented RETCODE lines do not match.
LETTER_TO_VALUE_REGEX = re.compile(r"\s+(?P<letter>[A-G]) = (?P<number>\d+)")
LETTERS = ["A", "B", "C", "D", "E", "F", "G"]
SUB_BLOCK_END = "\n\n"

# Read the whole input file (file name taken from the question).
with open("file.txt") as f:
    text = f.read()

top_list = []
for block in text.split(BLOCK_END):
    # Skip empty pieces, e.g. the trailing newline after the last ENDBLOCK.
    if not block.strip():
        continue
    sub_list = []
    for sub_block in block.split(SUB_BLOCK_END):
        letters_to_numbers_dict = {}
        for line in sub_block.split("\n"):
            match = LETTER_TO_VALUE_REGEX.match(line)
            if match:
                letters_to_numbers_dict[match["letter"]] = int(match["number"])
        # Skip sub-blocks without any A-G values (header text, "blah blah", blanks).
        if not letters_to_numbers_dict:
            continue
        # Missing letters become "" so every inner list has seven slots.
        sub_list.append([letters_to_numbers_dict.get(letter, "") for letter in LETTERS])
    top_list.append(sub_list)

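After that, top_list is the list you want. Empty slots are stored as "", because a literal like [1,5,3,,,,] is not valid Python.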
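
If the real file has blank lines that contain stray spaces, or uneven spacing around the equals sign, splitting on "\n\n" may miss sub-block boundaries. Here is a minimal line-by-line sketch of the same idea wrapped in a function. The names parse_blocks, VALUE_LINE and the missing parameter are hypothetical, not part of the code above, and the sketch assumes that any line which is not an indented A-G value line closes the current sub-block and that blocks without any values can be dropped.

```python
import re

# Hypothetical pattern: an indented "X = n" line, with flexible spacing around "=".
VALUE_LINE = re.compile(r"^\s+([A-G])\s*=\s*(\d+)\s*$")
LETTERS = "ABCDEFG"


def parse_blocks(text, missing=None):
    """Return one list per block, each holding one 7-slot list per sub-block."""
    result = []
    for block in text.split("ENDBLOCK"):
        sub_blocks = []
        current = {}  # letter -> value for the sub-block being read
        for line in block.splitlines():
            match = VALUE_LINE.match(line)
            if match:
                current[match.group(1)] = int(match.group(2))
            elif current:
                # Assumption: any non-value line (blank or text) ends the sub-block.
                sub_blocks.append([current.get(letter, missing) for letter in LETTERS])
                current = {}
        if current:
            # Flush a sub-block that runs up to the end of the block.
            sub_blocks.append([current.get(letter, missing) for letter in LETTERS])
        if sub_blocks:
            # Assumption: blocks with no values (e.g. trailing text) are dropped.
            result.append(sub_blocks)
    return result


# Shortened sample in the same layout as the question's input.
sample = """SOME TEXT
RETCODE = 0

    A = 1
    B = 5
    C = 3

    D = 0
    E = 0

blah blah

---    ENDBLOCK

SOME TEXT
RETCODE = 0
    A = 3
    B = 2

---    ENDBLOCK
"""

print(parse_blocks(sample))
```

With this sample, the first sub-block of the first block comes out as [1, 5, 3, None, None, None, None]. Pass missing="" if you prefer empty strings to None as the placeholder.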