Unexpected behaviour with numpy advanced slicing in named arrays

Question

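When working with numpy named (structured) arrays, I observe different behaviour in the following two cases:

Case 1: first use advanced slicing with an index array, then select a sub-array by name
Case 2: first select a sub-array by name, then use advanced slicing with an index array

The following code gives an example: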

import numpy as np

a = np.ones(5)
# zip() returns an iterator in Python 3, so materialize it as a list of tuples
data = np.array(list(zip(a, a, a)), dtype=[("x", float), ("y", float), ("z", float)])

# case 1
# does not set elements 1, 3 and 4 of data["y"] to 22
data[[1, 3, 4]]["y"] = 22
print(data["y"])  # -> [1. 1. 1. 1. 1.]

# case 2
# sets elements 1, 3 and 4 of data["y"] to 22
data["y"][[1, 3, 4]] = 22
print(data["y"])  # -> [ 1. 22.  1. 22. 22.]

The two print calls output [1. 1. 1. 1. 1.] and [1. 22. 1. 22. 22.]. Why does changing the order of the selections lead to different results when setting elements?

Answer 1

Indexing with a list or an array always returns a copy rather than a view:

In [1]: np.may_share_memory(data, data[[1, 3, 4]])
Out[1]: False

So the assignment data[[1, 3, 4]]["y"] = 22 modifies a copy of data[[1, 3, 4]], and the original values in data are left untouched.
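To make the hidden temporary visible, case 1 can be split into two statements. This is only a sketch that reuses the array from the question; the name tmp is introduced here for illustration and does not appear in the original code.

```python
import numpy as np

a = np.ones(5)
data = np.array(list(zip(a, a, a)),
                dtype=[("x", float), ("y", float), ("z", float)])

# Step 1: indexing with a list builds a new array holding copies of rows 1, 3 and 4.
tmp = data[[1, 3, 4]]

# Step 2: the field assignment writes into that copy only.
tmp["y"] = 22

print(tmp["y"])   # the copy now holds 22 in every "y" entry
print(data["y"])  # the original array still holds only ones
```

In the one-line form the copy is never bound to a name, so the modified values are discarded as soon as the statement finishes.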

On the other hand, referencing a field of a structured array returns a view:

In [2]: np.may_share_memory(data, data["y"])
Out[2]: True

So assigning to data["y"][[1, 3, 4]] will affect the corresponding elements in data.
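Case 2 can be unrolled the same way. The sketch below again reuses the array from the question, with y as an illustrative name. Note that an index list on the left-hand side of an assignment does not create a copy first: the values are written in place into whatever array is being indexed, which here is the view.

```python
import numpy as np

a = np.ones(5)
data = np.array(list(zip(a, a, a)),
                dtype=[("x", float), ("y", float), ("z", float)])

# Step 1: field access returns a view onto data's own memory.
y = data["y"]

# Step 2: assigning through an index list writes in place into that view.
y[[1, 3, 4]] = 22

print(data["y"])  # elements 1, 3 and 4 are now 22
print(data["x"])  # the other fields are unchanged
```

The practical rule that follows is to select the field first and apply the index array last, so that the final indexing step is the assignment itself.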

