Zip scikit-learn datasets

Question

I am trying to run the following snippet from a tutorial in Python 3.3:

>>> import numpy as np
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> np.array(zip(iris.data, iris.target))[0:10]

In 2.7 this returns the following output:

array([[array([ 5.1,  3.5,  1.4,  0.2]), 0],
   [array([ 4.9,  3. ,  1.4,  0.2]), 0],
   [array([ 4.7,  3.2,  1.3,  0.2]), 0],
   [array([ 4.6,  3.1,  1.5,  0.2]), 0],
   [array([ 5. ,  3.6,  1.4,  0.2]), 0],
   [array([ 5.4,  3.9,  1.7,  0.4]), 0],
   [array([ 4.6,  3.4,  1.4,  0.3]), 0],
   [array([ 5. ,  3.4,  1.5,  0.2]), 0],
   [array([ 4.4,  2.9,  1.4,  0.2]), 0],
   [array([ 4.9,  3.1,  1.5,  0.1]), 0]], dtype=object)

But in 3.3 it returns:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: 0-dimensional arrays can't be indexed

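I'm new to Python. I know there are differences between 2.x and 3.x, but I thought they only concerned the print function. I would appreciate an explanation of what is happening here and how I can get this to run in 3.3.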

Answer 1

The problem is that zip now returns an iterator rather than a list, so you need to convert it to a list first:

In [194]:

np.array(list(zip(iris.data, iris.target)))[0:10]
Out[194]:
array([[array([ 5.1,  3.5,  1.4,  0.2]), 0],
       [array([ 4.9,  3. ,  1.4,  0.2]), 0],
       [array([ 4.7,  3.2,  1.3,  0.2]), 0],
       [array([ 4.6,  3.1,  1.5,  0.2]), 0],
       [array([ 5. ,  3.6,  1.4,  0.2]), 0],
       [array([ 5.4,  3.9,  1.7,  0.4]), 0],
       [array([ 4.6,  3.4,  1.4,  0.3]), 0],
       [array([ 5. ,  3.4,  1.5,  0.2]), 0],
       [array([ 4.4,  2.9,  1.4,  0.2]), 0],
       [array([ 4.9,  3.1,  1.5,  0.1]), 0]], dtype=object)
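
As a minimal sketch of what changed, the snippet below uses only the standard library. The names `features` and `labels` are made-up stand-ins for `iris.data` and `iris.target`; they are not part of the original code.

```python
# Hypothetical stand-ins for iris.data and iris.target.
features = [[5.1, 3.5], [4.9, 3.0], [4.7, 3.2]]
labels = [0, 0, 1]

pairs = zip(features, labels)

# In Python 3, zip returns a lazy iterator, not a list.
is_list = isinstance(pairs, list)

# An iterator cannot be indexed or sliced directly.
try:
    pairs[0:2]
    sliceable = True
except TypeError:
    sliceable = False

# list() walks the iterator once and stores the pairs,
# so the result can be sliced like the Python 2 return value.
materialized = list(pairs)
first_two = materialized[0:2]

# The iterator is now exhausted; a second pass yields nothing.
leftover = list(pairs)

print(is_list, sliceable, first_two, leftover)
```

Because the iterator can only be consumed once, it is usually simplest to call `list(zip(...))` right where the pairs are created.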

The behavior of zip changed in Python 3. Note that when I ran your code, I got a different error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-193-ec320a0afa3a> in <module>()
          2 from sklearn import datasets
          3 iris = datasets.load_iris()
    ----> 4 np.array(zip(iris.data, iris.target))[0:10]

IndexError: too many indices for array

print is not the only thing that changed in Python 3.
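
To see where the IndexError comes from, here is a small sketch that assumes NumPy is installed. It uses two short made-up arrays (`xs` and `ys`) instead of the iris data. When `np.array` receives a zip object, it does not iterate over it; it stores the object itself in a 0-dimensional array, which has no axis to slice.

```python
import numpy as np

# Hypothetical toy data in place of iris.data and iris.target.
xs = np.array([1.0, 2.0, 3.0])
ys = np.array([0, 0, 1])

# np.array does not consume an iterator: the zip object is
# wrapped as a single element of a 0-dimensional object array.
wrapped = np.array(zip(xs, ys))
wrapped_ndim = wrapped.ndim
wrapped_dtype = wrapped.dtype

# Slicing a 0-dimensional array raises IndexError.
try:
    wrapped[0:10]
    raised = False
except IndexError:
    raised = True

# Materializing the pairs first gives a regular 2-D array.
paired = np.array(list(zip(xs, ys)))

print(wrapped_ndim, wrapped_dtype, raised, paired.shape)
```

The exact wording of the IndexError message may differ between NumPy versions, which would be consistent with the two different messages seen above.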

python

python-3.x

scikit-learn