IndexError: index 6842 is out of bounds for axis 0 with size 6842

Question

I know this is probably a really silly mistake, but I'm stuck on it. I need to use numpy to 1-hot encode an array.

numpy.version ==> 1.18.5

print(array)
[[   3 1275   10 ...    1 2235    1]
 [   0    0    0 ...    2  139  151]
 [1277 1278    1 ... 2239  831    1]
 ...
 [   2 6833   28 ...   25  520    1]
 [   0    0    0 ...    4  481    1]
 [   0    0    0 ... 6842 6843    1]]


print(array.shape)
# (1250, 20)    

print(array_classes)
# 6842

When I try to create the one-hot encoding of the array:

ohe = np.eye(array_classes)[array]

I get this error:

IndexError                                Traceback (most recent call last)
<ipython-input-32-3a05df861550> in <module>()
      1 print(array_classes)
      2 print(array)
----> 3 ohe = np.eye(array_classes)[array]

IndexError: index 6842 is out of bounds for axis 0 with size 6842

Edit:

Here is a simpler example of my error and the expected output.

This is my initial array (shape (2, 20): 2 examples with 20 values each):

array = array[:2]
print(array)
[[   3 1275   10 2231  830    1 2232    1 2233    4  220    1  339    1
  2234   15  477    1 2235    1]
 [   0    0    0    0    0    0    0    0    0    0    0 1276    1    2
  2236 2237   30    2  139  151]]

Here is the total number of classes:

 print(array_classes)
 # 6842

I create the identity matrix from the number of classes:

 matrix = np.eye(array_classes)
 print(matrix, matrix.shape)
[[1. 0. 0. ... 0. 0. 0.]
 [0. 1. 0. ... 0. 0. 0.]
 [0. 0. 1. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 1. 0. 0.]
 [0. 0. 0. ... 0. 1. 0.]
 [0. 0. 0. ... 0. 0. 1.]] (6843, 6843)

Then I index the identity matrix with the initial array to get the 1-hot encoded version of the array:

ohe = matrix[array]
print(ohe, ohe.shape)

[[[0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  ...
  [0. 1. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  [0. 1. 0. ... 0. 0. 0.]]

 [[1. 0. 0. ... 0. 0. 0.]
  [1. 0. 0. ... 0. 0. 0.]
  [1. 0. 0. ... 0. 0. 0.]
  ...
  [0. 0. 1. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]]] (2, 20, 6843)

Answer 1

You are using advanced indexing with array. Some values in array fall outside the bounds of the array created by np.eye:

print(np.eye(array_classes).shape)
# (6842, 6842)

That is, np.eye returns a 2D array with 6842 rows and 6842 columns. Its valid indices run from 0 to 6841.

Now you index it with array using advanced indexing:

np.eye(array_classes)[array]

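numpy uses each value of array as a row index into the np.eye output. Your array contains values greater than 6841, for example 6842 and 6843, so it raises the error as soon as it hits the first 6842.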
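
A scaled-down sketch of the same failure may make this easier to see. It assumes 4 classes instead of 6842; the names num_classes and labels are illustrative and do not come from the question:

```python
import numpy as np

num_classes = 4                      # stand-in for array_classes
labels = np.array([[0, 3], [1, 4]])  # 4 is one past the last valid row index

eye = np.eye(num_classes)            # shape (4, 4), valid row indices are 0..3

try:
    eye[labels]                      # advanced indexing: one row lookup per value
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

Every value in labels has to be strictly smaller than the number of rows in the identity matrix, which is why a label equal to the class count fails.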

Answer 2

For me, this was the solution:

np.eye(array_classes + 1)[array]

Because the largest index in my array had to be the number of classes itself, I had to add 1 to the size passed to np.eye().
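
A possible variation, offered as a sketch rather than as the poster's method: derive the matrix size from the data instead of hard-coding the +1. This may be safer here, since the array printed in the question also contains 6843, which a matrix with 6843 rows (indices 0 to 6842) would still not cover. The helper name one_hot and its num_classes parameter are hypothetical, and the code assumes the labels are non-negative integers:

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """One-hot encode an integer array by indexing an identity matrix.

    Hypothetical helper: if num_classes is omitted, it is inferred as
    max(labels) + 1 so that every label is a valid row index.
    """
    labels = np.asarray(labels)
    if num_classes is None:
        num_classes = int(labels.max()) + 1
    if labels.min() < 0 or labels.max() >= num_classes:
        raise ValueError("labels must lie in the range 0..num_classes - 1")
    # Each label selects one row of the identity matrix.
    return np.eye(num_classes, dtype=np.float32)[labels]

labels = np.array([[0, 3], [1, 4]])
ohe = one_hot(labels)
print(ohe.shape)
```

The result has the shape of the input plus one trailing axis of length num_classes. For a vocabulary of several thousand classes the dense result can get large, so a smaller dtype such as float32 (used above) or uint8 may be worth considering.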

python

arrays

encoding

numpy