Difference between list(numpy_array) and numpy_array.tolist()

Question

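What is the difference between applying list() to a numpy array and calling tolist() on it?

I checked the types of both outputs, and both show that the result is a list; however, the outputs do not look exactly the same. Is it because list() is not a numpy-specific method (i.e. it can be applied to any sequence) while tolist() is numpy-specific, and in this case they just happen to return the same thing?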

Input:

import numpy
points = numpy.random.random((5,2))
print "Points type: " + str(type(points))

Output:

Points type: <type 'numpy.ndarray'>

Input:

points_list = list(points)
print points_list
print "Points_list type: " + str(type(points_list))

Output:

[array([ 0.15920058,  0.60861985]), array([ 0.77414769,  0.15181626]), array([ 0.99826806,  0.96183059]), array([ 0.61830768,  0.20023207]), array([ 0.28422605,  0.94669097])]
Points_list type: <type 'list'>

Input:

points_list_alt = points.tolist()
print points_list_alt
print "Points_list_alt type: " + str(type(points_list_alt))

Output:

[[0.15920057939342847, 0.6086198537462152], [0.7741476852713319, 0.15181626186774055], [0.9982680580550761, 0.9618305944859845], [0.6183076760274226, 0.20023206937408744], [0.28422604852159594, 0.9466909685812506]]

Points_list_alt type: <type 'list'>

Answer 1

Your example already shows the difference; consider the following 2D array:

>>> import numpy as np
>>> a = np.arange(4).reshape(2, 2)
>>> a
array([[0, 1],
       [2, 3]])
>>> a.tolist()
[[0, 1], [2, 3]] # nested vanilla lists
>>> list(a)
[array([0, 1]), array([2, 3])] # list of arrays

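tolist handles the full conversion to nested vanilla lists (i.e. a list of lists of int), whereas list just iterates over the first dimension of the array, creating a list of arrays (a list of np.array of np.int64). Both results are lists: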

>>> type(list(a))
<type 'list'>
>>> type(a.tolist())
<type 'list'>

but the elements of each list have different types:

>>> type(list(a)[0])
<type 'numpy.ndarray'>
>>> type(a.tolist()[0])
<type 'list'>

The other difference, as you noted, is that list works on any iterable, while tolist can only be called on objects that specifically implement that method.
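To make that last point concrete, here is a small Python 3 sketch. It assumes nothing beyond numpy itself; the variable names are illustrative only. list accepts any iterable, whereas tolist is available only on objects that define it, such as numpy.ndarray.

```python
import numpy as np

a = np.arange(4).reshape(2, 2)

# list() only needs an iterable, so it works on arrays, generators, strings, ...
from_array = list(a)                              # two 1-D arrays, one per row
from_generator = list(x * x for x in range(3))
from_string = list("ab")

# tolist() is a method defined by ndarray; plain Python iterables
# such as range objects do not have it.
nested = a.tolist()
range_has_tolist = hasattr(range(3), "tolist")

print(type(from_array[0]).__name__)   # ndarray
print(from_generator)                 # [0, 1, 4]
print(from_string)                    # ['a', 'b']
print(nested)                         # [[0, 1], [2, 3]]
print(range_has_tolist)               # False
```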

Answer 2

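.tolist() converts all of the values recursively to Python primitives (nested lists of built-in scalars), whereas list creates a Python list from an iterable. Since iterating over a multi-dimensional numpy array yields its sub-arrays, list(...) creates a list of arrays.

You can think of list as a function that looks like this: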

# Not the actual implementation, just for demo purposes
def list(iterable):
    newlist = []
    for obj in iter(iterable):
        newlist.append(obj)
    return newlist
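In the same spirit, tolist can be modelled as a recursive function. This is only a rough pure-Python sketch, not how numpy implements it; to_nested_list is a hypothetical name introduced here for illustration.

```python
import numpy as np

def to_nested_list(arr):
    """Rough pure-Python model of ndarray.tolist() (hypothetical helper)."""
    arr = np.asarray(arr)
    if arr.ndim == 0:
        # A 0-d array holds a single value; item() returns it as a Python scalar.
        return arr.item()
    # Otherwise recurse over the first dimension, one sub-array at a time.
    return [to_nested_list(sub) for sub in arr]

a = np.arange(4).reshape(2, 2)
print(to_nested_list(a))                 # [[0, 1], [2, 3]]
print(to_nested_list(a) == a.tolist())   # True
```

The recursion bottoms out at 0-dimensional values, which is where the numpy scalars are swapped for built-in Python ones.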

Answer 3

The major difference is that tolist recursively converts all data into Python built-in types.

For instance:

>>> import numpy
>>> arr = numpy.arange(2)
>>> [type(item) for item in list(arr)]
[numpy.int64, numpy.int64]
>>> [type(item) for item in arr.tolist()]
[builtins.int, builtins.int]

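Aside from the functional differences, tolist will generally be quicker, because it knows it has a numpy array and has access to the backing array. list, by contrast, falls back to using an iterator to add the elements one by one.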

In [2]: arr = numpy.arange(1000)

In [3]: %timeit arr.tolist()
10000 loops, best of 3: 33 µs per loop

In [4]: %timeit list(arr)
10000 loops, best of 3: 80.7 µs per loop

I would expect tolist to be
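The comparison can also be run outside IPython with the standard timeit module. This is a minimal sketch; the repeat count of 1,000 is an arbitrary choice, and the absolute numbers will depend on the machine and the numpy version.

```python
import timeit
import numpy as np

arr = np.arange(1000)

# Both conversions hold equal values; only the element types differ.
same_values = list(arr) == arr.tolist()

# Time each conversion over the same number of calls.
runs = 1_000
t_tolist = timeit.timeit(arr.tolist, number=runs)
t_list = timeit.timeit(lambda: list(arr), number=runs)

print("same values:", same_values)
print(f"arr.tolist(): {t_tolist / runs * 1e6:.1f} µs per call")
print(f"list(arr):    {t_list / runs * 1e6:.1f} µs per call")
```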

Answer 4

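Other differences:

If you have a 1-D numpy array and convert it to a list with tolist(), the numpy scalars are changed to the nearest compatible built-in Python type. In contrast, list() does not do this; it keeps the numpy scalar types.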

import numpy as np

# list(...)
a = np.uint32([1, 2])
a_list = list(a)       # [1, 2]
type(a_list[0])        # <class 'numpy.uint32'>

# .tolist()
a_tolist = a.tolist()  # [1, 2]
type(a_tolist[0])      # <class 'int'>

If the numpy array is a 0-dimensional array holding a single scalar value, list() throws an error, but tolist() simply converts it to a Python scalar without any error.

a = np.array(5)

list(a) # TypeError: iteration over a 0-d array

a.tolist() # 5
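Both points can be combined in one small helper that returns built-in Python values regardless of dimension. This is only a sketch; to_builtin is a hypothetical name, not part of numpy.

```python
import numpy as np

def to_builtin(value):
    """Hypothetical helper: convert array-like input of any dimension to built-in types."""
    return np.asarray(value).tolist()

scalar = to_builtin(np.array(5))          # 0-d array -> plain int
vector = to_builtin(np.uint32([1, 2]))    # 1-D array -> list of plain ints

# list() cannot iterate over a 0-d array, so it raises TypeError instead.
try:
    list(np.array(5))
    list_error = None
except TypeError as exc:
    list_error = str(exc)

print(scalar, type(scalar).__name__)      # 5 int
print(vector, type(vector[0]).__name__)   # [1, 2] int
print(list_error)
```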
