Reading formatted multi-lines in Python

Question

I want to read some formatted data in Python. The data format looks like this:

00:00:00
1 1 1
1 1 1
1 1 1

00:00:02
3 3 3
3 3 3
3 3 3

I can read it successfully in C/C++ with the following code:

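#include <cstdio>
#include <iostream>
#include <string>
using namespace std;

int main()
{
    string hour;
    int x0,y0,z0, x1,y1,z1, x2,y2,z2;

    // each record: one time stamp line, then three lines of three integers
    while(cin >> hour)
    {
        scanf("%d %d %d\n%d %d %d\n%d %d %d\n", &x0, &y0, &z0, &x1, &y1, &z1, &x2, &y2, &z2);
        cout << hour << endl; //check the reading
    }
    return 0;
}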

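The problem is that I can't find a Python way to read a formatted multi-line string that is as simple as scanf. Some examples with np.genfromtxt come close to what I need, as do some with struct.unpack, but my skills are not enough to make them work with multiple lines. I could probably use split() and a few readline calls to get the formatted data exactly, but it drives me crazy that the program in C/C++ would be simpler than the one in Python. Is there any way to do something similar to the C/C++ code in Python?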

With Joril's help, the answer I arrived at is the following:

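from scanf import sscanf
import sys

FORMAT = "%s\n%d %d %d\n%d %d %d\n%d %d %d\n"

data = ''
for line in sys.stdin:
    if line != '\n':
        data += line  # accumulate the lines of the current record
    else:
        # a blank line ends the record: parse it and start over
        print(sscanf(data, FORMAT))
        data = ''
if data:
    # the last record may not be followed by a blank line
    print(sscanf(data, FORMAT))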

As output, I got something like:

('00:00:00', 1, 1, 1, 1, 1, 1, 1, 1, 1)
('00:00:02', 3, 3, 3, 3, 3, 3, 3, 3, 3)

Answer 1

Well, the Python FAQ says:

Is there a scanf() or sscanf() equivalent?

Not as such.

For simple input parsing, the easiest approach is usually to split the line into whitespace-delimited words using the split() method of string objects and then convert decimal strings to numeric values using int() or float(). split() supports an optional “sep” parameter which is useful if the line uses something other than whitespace as a separator.

For more complicated input parsing, regular expressions are more powerful than C’s sscanf() and better suited for the task.

But it looks like someone has made a module that does exactly what you want:
https://hkn.eecs.berkeley.edu/~dyoo/python/scanf
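
For comparison, the split()/int() approach the FAQ describes might look like the sketch below. It is not taken from the FAQ or from the scanf module: the names read_records and parse_block are made up for illustration, and the sample input is an assumption reconstructed from the output shown in the question (a time stamp line, three lines of three integers, and a blank line between records).

```python
def parse_block(block):
    """Turn one record (a list of lines) into (hour, rows)."""
    hour = block[0].strip()
    # split() with no argument splits on any run of whitespace
    rows = [[int(word) for word in line.split()] for line in block[1:]]
    return hour, rows


def read_records(lines):
    """Yield (hour, rows) for each blank-line-separated record."""
    block = []
    for line in lines:
        if line.strip():
            block.append(line)
        elif block:
            # a blank line closes the current record
            yield parse_block(block)
            block = []
    if block:
        # the last record may not be followed by a blank line
        yield parse_block(block)


# assumed sample input, modelled on the output in the question
sample = """00:00:00
1 1 1
1 1 1
1 1 1

00:00:02
3 3 3
3 3 3
3 3 3
"""

records = list(read_records(sample.splitlines()))
for hour, rows in records:
    print(hour, rows)
```

Because read_records accepts any iterable of lines, passing sys.stdin instead of sample.splitlines() should work the same way for piped input.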

Answer 2

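You can definitely use regular expressions. Here is more or less equivalent code in Python, without the loop:

import re
import sys

# input() would return only the first line, so read all of stdin instead
text = sys.stdin.read()
# re.match anchors at the start of the text, so this parses the first record
res = re.match(
    r'(?P<hour>\d\d):(?P<minute>\d\d):(?P<second>\d\d)\n'
    r'(?P<x0>\d+) (?P<y0>\d+) (?P<z0>\d+)\n'
    r'(?P<x1>\d+) (?P<y1>\d+) (?P<z1>\d+)\n'
    r'(?P<x2>\d+) (?P<y2>\d+) (?P<z2>\d+)',
    text, re.MULTILINE)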

if res:
    print(res.groupdict())
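
The match above covers a single record. One possible way to cover every record in the input, in place of the C/C++ while loop, is to compile the pattern once and walk the text with finditer. This is only a sketch: parse_all and RECORD are illustrative names, the time stamp is kept as one group for brevity, and the sample input is assumed from the output shown in the question.

```python
import re

# One record: a time stamp line followed by three lines of three integers.
RECORD = re.compile(
    r'(?P<hour>\d\d:\d\d:\d\d)\n'
    r'(?P<x0>\d+) (?P<y0>\d+) (?P<z0>\d+)\n'
    r'(?P<x1>\d+) (?P<y1>\d+) (?P<z1>\d+)\n'
    r'(?P<x2>\d+) (?P<y2>\d+) (?P<z2>\d+)'
)


def parse_all(text):
    """Return a list of (hour, {name: int}) pairs, one per record found."""
    results = []
    for match in RECORD.finditer(text):
        fields = match.groupdict()
        hour = fields.pop('hour')
        # the remaining groups are the nine integers, still as strings
        numbers = {name: int(value) for name, value in fields.items()}
        results.append((hour, numbers))
    return results


# assumed sample input, modelled on the output in the question
sample = """00:00:00
1 1 1
1 1 1
1 1 1

00:00:02
3 3 3
3 3 3
3 3 3
"""

parsed = parse_all(sample)
for hour, numbers in parsed:
    print(hour, numbers)
```

Note that \d+ only accepts non-negative integers, whereas scanf's %d also accepts a sign; changing each \d+ to -?\d+ would be one way to allow negative values.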

I would split the data into lines first and then parse it.
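
A minimal sketch of that idea, assuming every record is exactly four non-blank lines (one time stamp line plus three lines of integers) and that blank lines only act as separators. The function name parse_lines and the sample input are assumptions for illustration; the result is shaped like the tuples sscanf printed in the question.

```python
def parse_lines(text, lines_per_record=4):
    """Split text into lines, then parse fixed-size groups of lines."""
    # step 1: split into lines and drop the blank separators
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    # step 2: parse each group of `lines_per_record` lines
    records = []
    for start in range(0, len(lines), lines_per_record):
        chunk = lines[start:start + lines_per_record]
        if len(chunk) < lines_per_record:
            raise ValueError("incomplete record starting at %r" % chunk[0])
        hour = chunk[0]
        numbers = [int(word) for line in chunk[1:] for word in line.split()]
        records.append((hour,) + tuple(numbers))
    return records


# assumed sample input, modelled on the output in the question
sample = """00:00:00
1 1 1
1 1 1
1 1 1

00:00:02
3 3 3
3 3 3
3 3 3
"""

tuples = parse_lines(sample)
for record in tuples:
    print(record)
```

Unlike a format string, this version does not check how many integers appear on each line; adding a length check on the parsed numbers would be a reasonable extension if the input cannot be trusted.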


python

scanf

string-formatting