How to read different blocks from a single csv into different arrays?

Question

My text file consists of different "blocks" separated by blank lines, for example:

0 0 1
1 1 1
1 0 0

1 0 0 1
1 1 1 1
1 1 1 0

1 0 0 1
1 1 1 1
1 1 1 0
1 0 1 0

1 0 0
0 1 1
1 1 1

1 0 0 0 0 1
1 1 1 1 0 0
1 1 0 0 1 0
1 0 1 0 0 0

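I want to read each block into its own np array. I could not find an np.loadtxt() argument that splits the input at blank lines. I suspect that looping over the file with f = open('test_case_11x5.txt', 'r') and for line in f: ..., and applying conditions to each line, would be slow.

Does anyone know a neat way to do this?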

Answer 1

You can use the groupby function from itertools like this:

from itertools import groupby
import numpy as np

arr = []
with open('data.txt') as f_data:
    # Group consecutive lines by whether they hold data; blank lines separate the blocks.
    for k, g in groupby(f_data, lambda x: bool(x.strip())):
        if k:
            arr.append(np.array([[int(x) for x in d.split()] for d in g]))

This produces a list of np arrays, one per block.
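As a minimal sketch of the same groupby idea for blocks separated by blank lines, as in the question's sample, the loop can be wrapped in a function that accepts any iterable of lines. The name read_blocks and the in-memory sample are assumptions made for illustration; they are not part of the original answer.

```python
import io
from itertools import groupby

import numpy as np


def read_blocks(lines):
    """Return one 2-D integer array per run of consecutive non-blank lines."""
    blocks = []
    # The key is True for lines with content and False for blank lines,
    # so groupby yields alternating runs of data lines and separator lines.
    for has_content, group in groupby(lines, key=lambda line: bool(line.strip())):
        if has_content:
            blocks.append(np.array([[int(x) for x in row.split()] for row in group]))
    return blocks


# First two blocks from the question, held in memory instead of a file.
sample = io.StringIO("0 0 1\n1 1 1\n1 0 0\n\n1 0 0 1\n1 1 1 1\n1 1 1 0\n")
blocks = read_blocks(sample)
print([b.shape for b in blocks])  # [(3, 3), (3, 4)]
```

Because the function only iterates over lines, an open file object can be passed in place of the StringIO sample, and the file is read in a single pass.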

Answer 2

Here is a working solution using re.split and a short list comprehension. I assume the whole file has first been loaded into a variable named text:

import re, io
import numpy as np

# text = ...  # load the contents of your file here

[np.loadtxt(io.StringIO(t)) for t in re.split('\n\n', text)]

Output:

[array([[0., 0., 1.],
        [1., 1., 1.],
        [1., 0., 0.]]),
 array([[1., 0., 0., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 0.]]),
 array([[1., 0., 0., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 0.],
        [1., 0., 1., 0.]]),
 array([[1., 0., 0.],
        [0., 1., 1.],
        [1., 1., 1.]]),
 array([[1., 0., 0., 0., 0., 1.],
        [1., 1., 1., 1., 0., 0.],
        [1., 1., 0., 0., 1., 0.],
        [1., 0., 1., 0., 0., 0.]])]
