Jump to the next line in readlines inside a for loop

Question

I am writing code to extract the useful parts of a very large Source.txt file. A sample of my source file looks like this:

Test case AAA
Current Parameters:
    Some unique param : 1
    Some unique param : 2
    Some unique param :     3
    Some unique param : 4
*A line of rubbish*
*Another line of rubbish*
*Yet another line of rubbish*
*More and more rubbish*
Test AAA PASS
Test case BBB
Current Parameters:
    Some unique param : A
    Some unique param : B
    Some unique param :     C
    Some unique param : D
*A line of rubbish*
*Another line of rubbish*
*Yet another line of rubbish*
*More and more rubbish*
Test BBB PASS

Right now I am writing code that extracts only the Test case and Current Parameters lines:

processed = []

def main():
    source_file = open("Source.txt","r") #Open the raw trace file in read mode
    if source_file.mode == "r":
        contents = source_file.readlines()   #Read the contents of the file
        processed_contents = _process_content(contents)
        output_file = open("Output.txt","w")
        output_file.writelines(processed_contents)
        pass

def _process_content(contents):
    for raw_lines in contents:
        if "Test case" in raw_lines:
            processed.append(raw_lines)
        elif "Current Parameters" in raw_lines:
            processed.append(raw_lines)
            #I am stuck here
        elif "PASS" in raw_lines or "FAIL" in raw_lines:
            processed.append(raw_lines)
            processed.append("\n")
    return processed

#def _process_parameters():


if __name__ == '__main__':
    main()

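After the Current Parameters line, I want to grab every Some unique param line (they will not always be the same) and append it to the processed list, so that it also ends up in my Output.txt.

My desired output is: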

Test case AAA
Current Parameters:
    Some unique param : 1
    Some unique param : 2
    Some unique param :     3
    Some unique param : 4
    Test AAA PASS
Test case BBB
Current Parameters:
    Some unique param : A
    Some unique param : B
    Some unique param :     C
    Some unique param : D
    Test BBB PASS

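As you can see, I want to remove all the rubbish lines. Note that there is a lot of rubbish in my Source.txt. I am not sure how to move on to the next raw_lines from there. Thanks for your help.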

Answer 1

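It is hard to say for sure whether this will work, because I know nothing about the format of the rubbish lines, but I think you can check whether the line contains "param", just as you do for the other lines. The check is done on the lowercased line, since the sample spells the word in lower case:

def _process_content(contents):
    for raw_line in contents:
        if "Test case" in raw_line:
            processed.append(raw_line)
        elif "Current Parameters" in raw_line:
            processed.append(raw_line)
        elif "param" in raw_line.lower():  # case-insensitive match
            processed.append(raw_line)
        elif "PASS" in raw_line or "FAIL" in raw_line:
            processed.append(raw_line)
            processed.append("\n")
    return processed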
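
If the parameter names do not share a common word, a substring check will not find them. The following is a minimal sketch of an alternative that tracks whether the loop is currently inside a parameter block. It assumes that parameter lines are indented and that rubbish lines are not, which holds for the sample in the question but may not hold for the real file. The name "Another name" in the sample list is made up for illustration.

```python
def process_content(contents):
    """Keep test-case, parameter, and result lines; drop everything else."""
    processed = []
    in_params = False  # True while we are directly below "Current Parameters:"
    for raw_line in contents:
        if "Test case" in raw_line:
            in_params = False
            processed.append(raw_line)
        elif "Current Parameters" in raw_line:
            in_params = True
            processed.append(raw_line)
        elif in_params and raw_line[:1].isspace() and raw_line.strip():
            # Assumption: parameter lines are indented, rubbish lines are not.
            processed.append(raw_line)
        else:
            in_params = False  # the first other line ends the block
            if "PASS" in raw_line or "FAIL" in raw_line:
                processed.append(raw_line)
                processed.append("\n")
    return processed


sample = [
    "Test case AAA\n",
    "Current Parameters:\n",
    "    Some unique param : 1\n",
    "    Another name :     3\n",  # hypothetical parameter name
    "*A line of rubbish*\n",
    "Test AAA PASS\n",
]
kept = process_content(sample)
print("".join(kept), end="")
```

The function returns a fresh list on each call instead of appending to a module-level one, so calling it twice does not duplicate lines.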

Answer 2

Here is one approach using a regex.

For example:

import re

filename = "Source.txt"  # path to the raw trace file from the question

result = []
with open(filename) as infile:
    for raw_lines in infile:
        if "Test case" in raw_lines:
            result.append(raw_lines)
        if "Current Parameters" in raw_lines:
            result.append(raw_lines)
            # next() advances the same file iterator that the for loop uses;
            # the "" default avoids StopIteration at end of file.
            raw_lines = next(infile, "")
            while True:
                m = re.search(r"(?P<params>\s*\w+\s*:\s*\w+\s*)", raw_lines)
                if not m:
                    break
                result.append(m.group("params"))
                raw_lines = next(infile, "")
        if "PASS" in raw_lines or "FAIL" in raw_lines:
            result.append(raw_lines)
            result.append("\n")
print(result)

Output:

['Test case AAA\n',
 'Current Parameters:\n',
 ' param : 1\n',
 ' param : 2\n',
 ' param :     3\n',
 ' param : 4\n',
 'Test AAA PASS\n',
 '\n',
 'Test case BBB\n',
 'Current Parameters:\n',
 ' param : A\n',
 ' param : B\n',
 ' param :     C\n',
 ' param : D\n',
 'Test BBB PASS',
 '\n']
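
The same next() technique also works on the list returned by readlines(), as long as the for loop and the next() calls share one iterator. The sketch below is a hedged adaptation, not the answerer's code: the function name process_with_next is made up, and it assumes that every parameter line contains a colon and that the first line after the parameters does not. It appends the whole line rather than a regex match, so the "Some unique" prefix is kept.

```python
def process_with_next(contents):
    """Filter a list of lines, pulling parameter lines with next()."""
    processed = []
    lines = iter(contents)  # one iterator shared by the for loop and next()
    for raw_line in lines:
        if "Test case" in raw_line:
            processed.append(raw_line)
        elif "Current Parameters" in raw_line:
            processed.append(raw_line)
            # Lines consumed here are skipped by the outer loop.
            raw_line = next(lines, "")  # "" signals the end of the input
            while ":" in raw_line:  # assumption: parameter lines have a colon
                processed.append(raw_line)
                raw_line = next(lines, "")
        # raw_line may now be the line that ended the parameter block.
        if "PASS" in raw_line or "FAIL" in raw_line:
            processed.append(raw_line)
            processed.append("\n")
    return processed


contents = [
    "Test case AAA\n",
    "Current Parameters:\n",
    "    Some unique param : 1\n",
    "    Some unique param : 2\n",
    "*A line of rubbish*\n",
    "*More and more rubbish*\n",
    "Test AAA PASS\n",
]
result = process_with_next(contents)
print("".join(result), end="")
```

Calling next() directly on the list would raise a TypeError, because a list is iterable but is not itself an iterator; wrapping it in iter() once is what lets the inner loop move the outer loop forward.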

Answer 3

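You can use str.startswith() to pick out the lines you want and then write those lines back to the file. I would also split each line on ":" and check for a result of length 2 to find the parameters. It is also safer to convert each line to lowercase so that the match is case-insensitive and "Test" is not treated as different from "test".

Demo: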

lines = []
with open("source.txt") as f:
    for line in f:
        lowercase = line.lower()
        if (
            lowercase.startswith("test")
            or lowercase.startswith("current parameters:")
            or len(lowercase.split(":")) == 2
        ):
            lines.append(line)

with open("source.txt", mode="w") as o:
    for line in lines:
        o.write(line)

source.txt:

Test case AAA
Current Parameters:
    Some unique param : 1
    Some unique param : 2
    Some unique param :     3
    Some unique param : 4
Test AAA PASS
Test case BBB
Current Parameters:
    Some unique param : A
    Some unique param : B
    Some unique param :     C
    Some unique param : D
Test BBB PASS
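
For a very large file it may be preferable to stream the lines into a separate output file rather than collect them in a list and overwrite the source. The following is a minimal sketch under that assumption; filter_lines is a made-up helper name, and io.StringIO stands in for real files so the example runs on its own.

```python
import io


def filter_lines(src, dst):
    """Copy only test-case, parameter, and result lines from src to dst."""
    for line in src:
        lowercase = line.lower()
        if (
            lowercase.startswith(("test", "current parameters:"))
            or len(lowercase.split(":")) == 2  # exactly one colon in the line
        ):
            dst.write(line)


src = io.StringIO(
    "Test case AAA\n"
    "Current Parameters:\n"
    "    Some unique param : 1\n"
    "*A line of rubbish*\n"
    "Test AAA PASS\n"
)
dst = io.StringIO()
filter_lines(src, dst)
print(dst.getvalue(), end="")
```

With real files, the same function could be called as filter_lines(source, output) inside a with open("Source.txt") as source, open("Output.txt", "w") as output: block. Note that any rubbish line containing exactly one colon would also be kept.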

Answer 4

You can use a regex with a backreference (e.g. \2) to split out the test cases (regex101: https://regex101.com/r/rMwXEj/3):

import re

data = '''Test case AAA
Current Parameters:
    Some unique param : 1
    Some unique param : 2
    Some unique param :     3
    Some unique param : 4
*A line of rubbish*
*Another line of rubbish*
*Yet another line of rubbish*
*More and more rubbish*
Test AAA PASS
Test case BBB
Current Parameters:
    Some unique param : A
    Some unique param : B
    Some unique param :     C
    Some unique param : D
*A line of rubbish*
*Another line of rubbish*
*Yet another line of rubbish*
*More and more rubbish*
Test BBB PASS'''

# \2 refers back to the test name captured by ([A-Za-z]+)
for g in re.findall(r'(^Test case ([A-Za-z]+)\s+Current Parameters:(?:[^:]+:.*?$)*)+.*?(Test \2 (PASS|FAIL))', data, flags=re.DOTALL|re.M):
    print(g[0])
    print(g[2])

Prints:

Test case AAA
Current Parameters:
    Some unique param : 1
    Some unique param : 2
    Some unique param :     3
    Some unique param : 4
Test AAA PASS
Test case BBB
Current Parameters:
    Some unique param : A
    Some unique param : B
    Some unique param :     C
    Some unique param : D
Test BBB PASS
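
To illustrate what the backreference contributes, here is a smaller, hedged sketch that is separate from the pattern above. It only pairs each "Test case NAME" header with the result line carrying the same NAME; the pattern and the sample text are made up for illustration.

```python
import re

# \1 must match exactly the text captured by the first group (the test name).
pattern = re.compile(
    r"^Test case (\w+)$.*?^Test \1 (PASS|FAIL)$",
    flags=re.DOTALL | re.MULTILINE,
)

text = (
    "Test case AAA\n"
    "noise\n"
    "Test AAA PASS\n"
    "Test case BBB\n"
    "noise\n"
    "Test BBB FAIL\n"
)

results = [(m.group(1), m.group(2)) for m in pattern.finditer(text)]
print(results)  # [('AAA', 'PASS'), ('BBB', 'FAIL')]
```

Because the result line has to repeat the captured name, a header such as "Test case AAA" followed only by "Test BBB PASS" produces no match.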
