Is there any package to read a table from a file in Python?

I used the Python package tabulate to store some data in a file in table format:

>>> print(tabulate(table, headers, tablefmt="orgtbl"))

The table looks like:

| name   |   num |
|--------+-------|
| abcd   |    30 |
| efgh   |   100 |
| ijklm  |    10 |

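Now I need to feed this data to some other programs (written in Python). Is there any easy way (I mean any package) to read the table into some data structure instead of parsing it explicitly? In other words, I can print the table in other formats (e.g. grid, pipe, mediawiki, latex); is there any ready-made solution to read it back from any such format into a data structure?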

You could take a look at astropy.io.ascii (previously known as Asciitable) and see whether it meets your needs.

The following shows a few of the ASCII formats that are available, while the section on Supported formats contains the full list.

  • Basic: basic table with customizable delimiters and header configurations
  • Cds: CDS format table (also Vizier and ApJ machine readable tables)
  • Daophot: table from the IRAF DAOphot package
  • Ecsv: Enhanced CSV format
  • FixedWidth: table with fixed-width columns (see also Fixed-width Gallery)
  • Ipac: IPAC format table
  • HTML: HTML format table contained in a <table> tag
  • Latex: LaTeX table with datavalue in the tabular environment
  • Rdb: tab-separated values with an extra line after the column definition line
  • SExtractor: SExtractor format table

Is there any easy way (I mean any package) to read the table into some data structure instead of parsing it explicitly?

With a bit of effort, csv.reader can do it:

from csv import reader

with open('table') as f:
    next(f) # throw away header
    next(f) # throw away |-----+-----|
    for line in reader((l.strip().strip('|') for l in f), delimiter='|'):
        print(line)

Output:

[' abcd   ', '    30 ']
[' efgh   ', '   100 ']
[' ijklm  ', '    10 ']

Not perfect, but close.

However, I think parsing it manually is more readable:

with open('table') as f:
    next(f) # throw away header
    next(f) # throw away |-----+-----|
    for line in f:
        print(line.strip().strip('|').split('|'))

Stripping the extra whitespace is also simple:

with open('table') as f:
    next(f) # throw away header
    next(f) # throw away |-----+-----|
    for line in f:
        print([scalar.strip() for scalar in line.strip().strip('|').split('|')])

Output:

['abcd', '30']
['efgh', '100']
['ijklm', '10']
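
If you want to keep the header instead of throwing it away, the same idea can be wrapped in a small helper. This is only a sketch: parse_org_table is a hypothetical name, not part of tabulate or any other package. It assumes no cell contains a literal '|' and leaves every value as a string.

```python
def parse_org_table(lines):
    """Parse an orgtbl-style table into a list of dicts keyed by header name.

    Hypothetical helper: assumes no cell contains '|' and keeps values as strings.
    """
    rows = []
    for line in lines:
        line = line.strip()
        # Skip blank lines and the |-----+-----| separator row.
        if not line or line.startswith('|-'):
            continue
        rows.append([cell.strip() for cell in line.strip('|').split('|')])
    if not rows:
        return []
    header, *data = rows  # first remaining row holds the column names
    return [dict(zip(header, row)) for row in data]


sample = """\
| name   |   num |
|--------+-------|
| abcd   |    30 |
| efgh   |   100 |
| ijklm  |    10 |
"""

records = parse_org_table(sample.splitlines())
print(records)
```

To read from the file, pass the open file object instead, e.g. parse_org_table(f) inside the with open('table') as f: block. Convert num with int() yourself if you need numbers.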

That said, I would use tabulate only for displaying data. For storage, I would use csv or json.
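
A rough sketch of what that could look like with only the standard library. The headers and table values are taken from the question; io.StringIO stands in for a real file here, and the "headers"/"rows" layout of the JSON document is just one possible choice, not a convention.

```python
import csv
import io
import json

headers = ["name", "num"]
table = [["abcd", 30], ["efgh", 100], ["ijklm", 10]]

# CSV round trip: every value comes back as a string.
buf = io.StringIO()  # stand-in for open('table.csv', 'w', newline='')
writer = csv.writer(buf)
writer.writerow(headers)
writer.writerows(table)

buf.seek(0)
csv_rows = list(csv.DictReader(buf))
print(csv_rows[0]["name"], csv_rows[0]["num"])

# JSON round trip: numbers keep their type.
text = json.dumps({"headers": headers, "rows": table})
loaded = json.loads(text)
print(loaded["rows"])
```

With csv you have to convert num back to int on reading; json preserves the integers, at the cost of a less spreadsheet-friendly file.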