Extracting information between strings in a text file

I have a data file structured as follows:

handle:trial1

key_left:3172

key_up:

xcoords:12,12,12,15........

ycoords:200,200,206,210,210......

t:20,140,270,390.....

goalx:2

goaly:12

fractal:images/file.png

seen:true

pauseTimes:

fractal:images/file2.png

seen:False

pauseTimes:

...
...

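I only want to extract the information after the goaly line up to the pauseTimes line. If I knew the goaly values for all of the trials, I could specify that line and extract the data between goaly: and pauseTimes, but I won't know the value of any goaly ahead of time because they are generated dynamically.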

How can I use the string "goaly" to identify that line, and then extract all subsequent lines up to the pauseTimes line?

extracting = False
extracted = []  # lines from the goaly: line through the pauseTimes: line
with open('path/to/file') as f:
    for line in f:
        if line.startswith('goaly:'):
            extracting = True
        if extracting:
            # I'm not really sure how you want to receive this data,
            # so this just collects it; swap in your own handling here.
            extracted.append(line.rstrip('\n'))
        if line.startswith('pauseTimes:'):
            extracting = False
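If you want each goaly-to-pauseTimes span as its own group, the same flag idea can be wrapped in a function. This is only a sketch: the name `extract_blocks`, its `start`/`end` parameters, and the in-memory sample (a shortened copy of the data in the question) are my own assumptions, not something from the question.

```python
import io


def extract_blocks(lines, start='goaly:', end='pauseTimes:'):
    """Return a list of blocks; each block runs from a start line through the next end line."""
    blocks = []
    current = None  # None means we are currently outside a block
    for raw in lines:
        line = raw.rstrip('\n')
        if line.startswith(start):
            current = []  # open a new block at the start marker
        if current is not None:
            current.append(line)
            if line.startswith(end):
                blocks.append(current)  # close the block at the end marker
                current = None
    return blocks


# Shortened copy of the sample data, held in memory instead of a file.
sample = io.StringIO(
    "handle:trial1\n"
    "key_left:3172\n"
    "goalx:2\n"
    "goaly:12\n"
    "fractal:images/file.png\n"
    "seen:true\n"
    "pauseTimes:\n"
    "fractal:images/file2.png\n"
    "seen:False\n"
    "pauseTimes:\n"
)
blocks = extract_blocks(sample)
print(blocks)
```

Note that a block that never reaches a pauseTimes line before the end of the file is dropped here, and the second fractal/seen/pauseTimes group is skipped because no goaly line precedes it. Adjust that if your data needs different handling.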

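You can loop over the lines and use a state variable to keep track of whether you care about the current line. I like to track parsing state like this with a generator, so that it stays separate from the processing code. For your example, here's the generator: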

def parse(infile):
    returning = False
    trial = None
    for line in infile:
        line = line.rstrip()
        if not line:
            continue

        if line.startswith('handle:'):
            trial = line[len('handle:'):]

        if line.startswith('goaly:'):
            returning = True
        elif line.startswith('pauseTimes:'):
            returning = False

        if returning:
            yield trial, line

Here's how you would use it:

with open('test.txt', 'r') as infile:  # close the file when done
    for trial, line in parse(infile):
        print(trial, line)

This has the added bonus of keeping track of which trial you're in.
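Because the generator yields the trial name with every line, grouping the results per trial takes only a few more lines. The sketch below is one possible way to do it, not the only one: `group_by_trial` is a hypothetical helper, and the second trial in the sample (`trial2`, `images/other.png`) is made-up data added purely to show the grouping.

```python
import io
from collections import defaultdict


def parse(infile):
    # Same generator as above, repeated so this snippet runs on its own.
    returning = False
    trial = None
    for line in infile:
        line = line.rstrip()
        if not line:
            continue
        if line.startswith('handle:'):
            trial = line[len('handle:'):]
        if line.startswith('goaly:'):
            returning = True
        elif line.startswith('pauseTimes:'):
            returning = False
        if returning:
            yield trial, line


def group_by_trial(infile):
    """Collect (key, value) pairs for each trial from the parse() output."""
    grouped = defaultdict(list)
    for trial, line in parse(infile):
        key, _, value = line.partition(':')  # split on the first colon only
        grouped[trial].append((key, value))
    return dict(grouped)


# In-memory sample; trial2 is invented for illustration.
sample = io.StringIO(
    "handle:trial1\n\ngoalx:2\n\ngoaly:12\n\nfractal:images/file.png\n\n"
    "seen:true\n\npauseTimes:\n\n"
    "handle:trial2\n\ngoaly:7\n\nfractal:images/other.png\n\n"
    "seen:False\n\npauseTimes:\n"
)
result = group_by_trial(sample)
for trial, pairs in result.items():
    print(trial, pairs)
```

As with the generator itself, the goaly line is included in each group and the pauseTimes line is not.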