StringIO example does not work
I am trying to understand how the numpy.genfromtxt method and io.StringIO work.
On the official site (https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.genfromtxt.html#numpy.genfromtxt) I found some examples. Here is one of them:
from io import StringIO
import numpy as np
s = StringIO("1,1.3,abcde")
data = np.genfromtxt(s, dtype=[('myint','i8'),('myfloat','f8'),('mystring','S5')], delimiter=",")
But when I run this code on my computer, I get: TypeError: must be str or None, not bytes
Could you tell me how to fix it?
Consider upgrading numpy, because with the current version of numpy your code works as written. For the relevant changes in np.genfromtxt, see the mention in the 1.14.0 release note highlights and the section "Encoding argument for text IO functions".
With older numpy, you are passing a string object as input, but the documentation you linked says:
Note that generators must return byte strings in Python 3k.
So do what the documentation says and give it a byte string:
import io
s = io.BytesIO(b"1,1.3,abcde")
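If the data starts out as a Python 3 str rather than a bytes literal, one way to follow the same advice is to encode it before wrapping it in io.BytesIO. This is only a minimal sketch, assuming the text is plain ASCII; the variable name `text` is made up for illustration.

```python
import io
import numpy as np

# Assumption: the text is plain ASCII, so encoding it to bytes is lossless.
text = "1,1.3,abcde"

# Wrap the encoded bytes in a file-like object that yields byte strings.
s = io.BytesIO(text.encode("ascii"))

data = np.genfromtxt(
    s,
    dtype=[('myint', 'i8'), ('myfloat', 'f8'), ('mystring', 'S5')],
    delimiter=",",
)
print(data)
```

Because the dtype asks for 'S5', the string field comes back as a byte string either way.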
In [200]: np.__version__
Out[200]: '1.14.0'
This example works for me:
In [201]: s = io.StringIO("1,1.3,abcde")
In [202]: np.genfromtxt(s, dtype=[('myint','i8'),('myfloat','f8'),
...: ... ('mystring','S5')], delimiter=",")
Out[202]:
array((1, 1.3, b'abcde'),
dtype=[('myint', '<i8'), ('myfloat', '<f8'), ('mystring', 'S5')])
It also works with a byte string:
In [204]: s = io.BytesIO(b"1,1.3,abcde")
In [205]: np.genfromtxt(s, dtype=[('myint','i8'),('myfloat','f8'),
...: ... ('mystring','S5')], delimiter=",")
Out[205]:
array((1, 1.3, b'abcde'),
dtype=[('myint', '<i8'), ('myfloat', '<f8'), ('mystring', 'S5')])
genfromtxt works with anything that feeds it lines, so I usually use a list of byte strings directly (when testing questions):
In [206]: s = [b"1,1.3,abcde"]
In [207]: np.genfromtxt(s, dtype=[('myint','i8'),('myfloat','f8'),
...: ... ('mystring','S5')], delimiter=",")
Out[207]:
array((1, 1.3, b'abcde'),
dtype=[('myint', '<i8'), ('myfloat', '<f8'), ('mystring', 'S5')])
Or multiple lines:
In [208]: s = b"""1,1.3,abcde
...: 4,1.3,two""".splitlines()
In [209]: s
Out[209]: [b'1,1.3,abcde', b'4,1.3,two']
In [210]: np.genfromtxt(s, dtype=[('myint','i8'),('myfloat','f8'),
...: ... ('mystring','S5')], delimiter=",")
Out[210]:
array([(1, 1.3, b'abcde'), (4, 1.3, b'two')],
dtype=[('myint', '<i8'), ('myfloat', '<f8'), ('mystring', 'S5')])
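The same idea extends to generators, which is the case the quoted documentation note is about. Below is a small sketch, assuming a hypothetical generator function `byte_lines` that yields one byte string per row.

```python
import numpy as np

def byte_lines():
    # Hypothetical source of rows; each yielded item is one line as bytes.
    yield b"1,1.3,abcde"
    yield b"4,1.3,two"

data = np.genfromtxt(
    byte_lines(),
    dtype=[('myint', 'i8'), ('myfloat', 'f8'), ('mystring', 'S5')],
    delimiter=",",
)
print(data)
```

Yielding bytes keeps the example usable on both older and newer numpy.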
Previously, with dtype=None, genfromtxt created S strings.
NumPy dtype issues in genfromtxt(), reads string in as bytestring
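On a numpy where the string field still comes back as S bytes, one option is to decode that field after loading. This is a sketch rather than the only way to do it; it assumes the data is ASCII and uses np.char.decode on the field.

```python
import numpy as np

lines = [b"1,1.3,abcde", b"4,1.3,two"]
arr = np.genfromtxt(
    lines,
    dtype=[('myint', 'i8'), ('myfloat', 'f8'), ('mystring', 'S5')],
    delimiter=",",
)

# Decode the byte-string field into a unicode array.
# Assumption: the bytes are ASCII; pick another codec if they are not.
decoded = np.char.decode(arr['mystring'], 'ascii')
print(decoded)
```

The structured array itself is unchanged; `decoded` is a separate array with a U dtype.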
In 1.14, we can control the default string dtype:
In [219]: s = io.StringIO("1,1.3,abcde")
In [220]: np.genfromtxt(s, dtype=None, delimiter=",")
/usr/local/bin/ipython3:1: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
#!/usr/bin/python3
Out[220]:
array((1, 1.3, b'abcde'),
dtype=[('f0', '<i4'), ('f1', '<f8'), ('f2', 'S5')])
In [221]: s = io.StringIO("1,1.3,abcde")
In [222]: np.genfromtxt(s, dtype=None, delimiter=",",encoding=None)
Out[222]:
array((1, 1.3, 'abcde'),
dtype=[('f0', '<i4'), ('f1', '<f8'), ('f2', '<U5')])
https://docs.scipy.org/doc/numpy/release.html#encoding-argument-for-text-io-functions
Now I can generate examples with Py3 strings, without all those ugly b'string' results (but keep in mind that not everyone has upgraded to 1.14):
In [223]: s = """1,1.3,abcde
...: 4,1.3,two""".splitlines()
In [224]: np.genfromtxt(s, dtype=None, delimiter=",",encoding=None)
Out[224]:
array([(1, 1.3, 'abcde'), (4, 1.3, 'two')],
dtype=[('f0', '<i4'), ('f1', '<f8'), ('f2', '<U5')])
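For code that has to run on both sides of the 1.14 boundary, one possible approach is to try the encoding keyword first and fall back to byte strings if it is rejected. This is a rough sketch under stated assumptions: the helper name `read_csv_lines` is hypothetical, the input is ASCII, and a TypeError from the first call is taken to mean the keyword is unsupported, which could also hide an unrelated TypeError.

```python
import numpy as np

def read_csv_lines(lines):
    """Parse a list of str lines; hypothetical helper for illustration."""
    try:
        # numpy >= 1.14: encoding=None gives unicode (U) string fields.
        return np.genfromtxt(lines, dtype=None, delimiter=",", encoding=None)
    except TypeError:
        # Assumed to mean an older numpy without the encoding keyword:
        # feed byte strings instead, which yields S string fields.
        byte_lines = [line.encode("ascii") for line in lines]
        return np.genfromtxt(byte_lines, dtype=None, delimiter=",")

rows = """1,1.3,abcde
4,1.3,two""".splitlines()
result = read_csv_lines(rows)
print(result)
```

On older numpy the string field would still be bytes, so callers that need str would have to decode it themselves.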