The comments argument of genfromtxt in numpy

Question

I am learning the I/O function genfromtxt in numpy. I tried an example from the numpy user guide that concerns the comments argument of genfromtxt.

Here is the example from the numpy user guide:

>>> data = """#
... # Skip me !
... # Skip me too !
... 1, 2
... 3, 4
... 5, 6 #This is the third line of the data
... 7, 8
... # And here comes the last line
... 9, 0
... """
>>> np.genfromtxt(StringIO(data), comments="#", delimiter=",")
[[ 1. 2.]
[ 3. 4.]
[ 5. 6.]
[ 7. 8.]
[ 9. 0.]]

I tried the following:

data = """#                 \
    # Skip me !         \
    # Skip me too !     \
    1, 2                \
    3, 4                \
    5, 6 #This is the third line of the data    \
    7, 8                \
    # And here comes the last line  \
    9, 0                \
    """
a = np.genfromtxt(io.BytesIO(data.encode()), comments = "#", delimiter = ",")
print (a)

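The result was:

genfromtxt: Empty input file: "<_io.BytesIO object at 0x0000020555DC5EB8>"
  warnings.warn('genfromtxt: Empty input file: "%s"' % fname)

I know the problem is in the data. Can anyone show me how to set up the data as shown in the example? Thank you very much.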

Answer 1

Try the following. First, do not use "\". Second, why use BytesIO()? Use StringIO() instead.

import numpy as np
from io import StringIO  # Python 3; on Python 2 use: from StringIO import StringIO

data = """#
    # Skip me !
    # Skip me too !
    1, 2
    3, 4
    5, 6 #This is the third line of the data
    7, 8
    # And here comes the last line
    9, 0
    """

np.genfromtxt(StringIO(data), comments="#", delimiter=",")

# Result:
# array([[ 1.,  2.],
#        [ 3.,  4.],
#        [ 5.,  6.],
#        [ 7.,  8.],
#        [ 9.,  0.]])

Answer 2

In an ipython3 (py3) interactive session I can do:

In [326]: data = b"""#
     ...: ... # Skip me !
     ...: ... # Skip me too !
     ...: ... 1, 2
     ...: ... 3, 4
     ...: ... 5, 6 #This is the third line of the data
     ...: ... 7, 8
     ...: ... # And here comes the last line
     ...: ... 9, 0
     ...: ... """
In [327]: 
In [327]: data
Out[327]: b'#\n# Skip me !\n# Skip me too !\n1, 2\n3, 4\n5, 6 #This is the third line of the data\n7, 8\n# And here comes the last line\n9, 0\n'
In [328]: np.genfromtxt(data.splitlines(),comments='#', delimiter=',')
Out[328]: 
array([[ 1.,  2.],
       [ 3.,  4.],
       [ 5.,  6.],
       [ 7.,  8.],
       [ 9.,  0.]])

In Python 3 the input string needs to be bytes (at least with the NumPy version used here); in Py2 that is the default string type.
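As a rough sketch of that point, the approach from the question (encoding a str and wrapping it in io.BytesIO) should work once the string contains real newlines. The variable names text and raw are only illustrative, and the data is the user-guide sample without the trailing backslashes.

```python
import io
import numpy as np

# Same sample as in the question, but with real newlines (no trailing "\").
text = """#
# Skip me !
# Skip me too !
1, 2
3, 4
5, 6 #This is the third line of the data
7, 8
# And here comes the last line
9, 0
"""

raw = text.encode()  # str -> bytes, as the question already did

# BytesIO gives genfromtxt a file-like object that yields one bytes line at a time.
a = np.genfromtxt(io.BytesIO(raw), comments="#", delimiter=",")

print(type(raw).__name__)
print(a.shape)
```

Everything after a "#" on a line is dropped, so the five data rows remain and the comment-only lines are skipped.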

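For multiline string input (triple quotes) do not use \. That is a line continuation, and it removes the newline. You want to keep the \n.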

data = b"""
one
two
"""

Note that I could also use:

data = '#\n# Skip me\n...'

with explicit \n.
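To make the effect of the backslashes concrete, here is a small sketch that uses only plain Python strings. The names collapsed and kept are illustrative, and the sample is shortened from the one in the question.

```python
# A backslash at the end of a line inside a (non-raw) triple-quoted string
# swallows the newline, so all the text ends up on a single line.
collapsed = """#\
# Skip me !\
1, 2\
3, 4\
"""

# Without the backslashes every line keeps its "\n".
kept = """#
# Skip me !
1, 2
3, 4
"""

print("\n" in collapsed)             # False
print(len(collapsed.splitlines()))   # 1
print(len(kept.splitlines()))        # 4

# The single collapsed line starts with "#", so nothing is left
# once the comment part is cut off.
print(repr(collapsed.split("#")[0]))  # ''
```

That is consistent with the "Empty input file" warning in the question: the whole input is one line that begins with the comment character.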

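genfromtxt works with any iterable that gives it lines. So I gave it a list of lines, produced with splitlines. StringIO (or BytesIO in Py3) also works, but it is extra work.

Of course, another option is to copy these lines into a text editor and save them as a simple text file. Copy-pasting into an interactive session is a handy shortcut, but it is not required.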
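A minimal sketch of the "any iterable of lines" idea, assuming a shortened version of the sample data: a list of bytes lines and a generator over the same lines should produce the same array. The names from_list and from_generator are illustrative.

```python
import numpy as np

data = b"""#
# Skip me !
1, 2
3, 4
5, 6 #This is the third line of the data
# And here comes the last line
9, 0
"""

lines = data.splitlines()  # list of bytes objects, one per line

# A list of lines ...
from_list = np.genfromtxt(lines, comments="#", delimiter=",")

# ... or a generator that yields the same lines one at a time.
from_generator = np.genfromtxt((line for line in lines),
                               comments="#", delimiter=",")

print(from_list.shape)
print(np.array_equal(from_list, from_generator))
```

Using a list avoids building a file-like wrapper, which is why splitlines is the shorter route here.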

In [329]: data.splitlines()
Out[329]: 
[b'#',
 b'# Skip me !',
 b'# Skip me too !',
 b'1, 2',
 b'3, 4',
 b'5, 6 #This is the third line of the data',
 b'7, 8',
 b'# And here comes the last line',
 b'9, 0']

