Bug in the Python StringIO module when using numpy

Question

Very simple code:

import StringIO
import numpy as np
c = StringIO.StringIO()
c.write("1 0")
a = np.loadtxt(c)
print a

I get an empty array plus a warning that c is an empty file.

I worked around it by adding:

d=StringIO.StringIO(c.getvalue())
a = np.loadtxt(d)

I don't think this should happen. What is going on here?

Answer 1

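Because the file object's position is at the end of the file after the write. So when numpy reads it, it reads from the end of the file to the end of the file, which is nothing.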

Seek to the beginning of the file and it works:

>>> from StringIO import StringIO
>>> s = StringIO()
>>> s.write("1 2")
>>> s.read()
''
>>> s.seek(0)
>>> s.read()
'1 2'
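
Applied to the code in the question, the same fix might look like the sketch below. It assumes Python 3, where the class lives in `io.StringIO` rather than in the `StringIO` module used above; the variable name `buf` is only illustrative.

```python
import io

import numpy as np

# Python 3 stand-in for the question's code (assumption: io.StringIO
# replaces the Python 2 StringIO.StringIO class).
buf = io.StringIO()
buf.write("1 0")

# After the write the position is at the end of the buffer,
# so rewind before handing the object to numpy.
buf.seek(0)

a = np.loadtxt(buf)
print(a)
```

Without the `seek(0)` call, `np.loadtxt` starts reading at the end of the buffer and finds no data, which matches the empty-file warning described in the question.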

Answer 2

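StringIO is a file-like object, so it behaves consistently with a file. There is the concept of a file pointer: the current position within the file. When you write data to a StringIO object, the file pointer moves to the end of the data. When you then try to read from it, the file pointer is already at the end of the buffer, so no data is returned.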

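To read it back, you can do one of two things:

Use StringIO.getvalue(), as you have already discovered. This returns the data from the beginning of the buffer and leaves the file pointer unchanged.
Use StringIO.seek(0) to reposition the file pointer to the start of the buffer, then call StringIO.read() to read the data.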
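
The two options can be put side by side. This is a minimal sketch assuming Python 3's `io.StringIO`; the variable names are illustrative and not part of the original answer.

```python
import io

s = io.StringIO()
s.write("hi there")

# Option 1: getvalue() returns the whole buffer and ignores the position.
whole = s.getvalue()
pos_after_getvalue = s.tell()  # still at the end of the buffer

# Option 2: rewind, then read from the current position.
s.seek(0)
text = s.read()

print(repr(whole), pos_after_getvalue, repr(text))
```

Both options yield the same string; the difference is that only the second one moves the file pointer, which matters when the object is passed to code that calls `read()` itself.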

Demo

>>> from StringIO import StringIO

>>> s = StringIO()
>>> s.write('hi there')
>>> s.read()
''
>>> s.tell()    # shows the current position of the file pointer
8
>>> s.getvalue()
'hi there'
>>> s.tell()
8
>>> s.read()
''
>>> s.seek(0)
>>> s.tell()
0
>>> s.read()
'hi there'
>>> s.tell()
8
>>> s.read()
''

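There is one exception. If you supply a value when creating the StringIO, the buffer is initialised with that value, but the file pointer is positioned at the start of the buffer: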

>>> s = StringIO('hi there')
>>> s.tell()
0
>>> s.read()
'hi there'
>>> s.read()
''
>>> s.tell()
8

That is why it works when you use

d=StringIO.StringIO(c.getvalue())

because you initialise the StringIO object at creation time, and the file pointer is then at the beginning of the buffer.
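
The difference between the two objects can be checked directly. The sketch below again assumes Python 3's `io.StringIO` and reuses the names `c` and `d` from the question.

```python
import io

c = io.StringIO()
c.write("1 0")

# A new object initialised with the old buffer's contents
# starts with its file pointer at position 0.
d = io.StringIO(c.getvalue())

c_pos, d_pos = c.tell(), d.tell()
c_data, d_data = c.read(), d.read()

print(c_pos, d_pos)
print(repr(c_data), repr(d_data))
```

Here `c` is positioned at the end of what was written and reads back nothing, while `d` starts at the beginning and returns the full contents.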
