Numpy ndarray: accessing input features of each class

Question

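For my current classification task, I want to access the input features of each individual class, so that a weak classifier is trained only on the input features of its own class and the classifiers are later combined into an ensemble.

I'm having trouble accessing these features. Admittedly, I always get confused by multidimensional arrays. The MWE below shows how I tried to access the per-class features.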

import keras
import numpy as np
from sklearn.model_selection import train_test_split

Data = np.random.randn(20, 1, 5, 4)
x,y,z = np.repeat(0, 7), np.repeat(1, 7), np.repeat(2, 6)
labels = np.hstack((x,y,z))

LABELS= list(set(np.ndarray.flatten(labels)))
Class_num = len(LABELS)

trainX, testX, trainY, testY = train_test_split(Data, 
                      labels, test_size=0.20, random_state=42)

#...to categorical
trainY = keras.utils.to_categorical(trainY, num_classes=Class_num)
testY = keras.utils.to_categorical(testY, num_classes=Class_num)

ensemble = []
for i in range(len(LABELS)):
    print('Train on class ' ,LABELS[i])
    sub_train = trainX[trainY == i]
    sub_test = testX[testY == i]

    #model fit follows...

Error:

Train on class  0

---------------------------------------------------------------------------

IndexError                                Traceback (most recent call last)

<ipython-input-11-52ceeb9a1011> in <module>()
     20 for i in range(len(LABELS)):
     21     print('Train on class ' ,LABELS[i])
---> 22     sub_train = trainX[trainY == i]
     23     sub_test = testX[testY == i]
     24 

IndexError: boolean index did not match indexed array along dimension 1; dimension is 1 but corresponding boolean dimension is 3

Clearly, my array indexing is wrong. Note the shape of trainX/testX.

Answer 1

Use argmax(axis=1).

In your code, you call to_categorical on trainY. This gives you an array of shape (16, 3), where 3 is the number of classes:

[[0. 1. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]

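Comparing this array with i produces a boolean mask of shape (16, 3), which does not line up with trainX's shape (16, 1, 5, 4), hence the IndexError. Applying argmax(axis=1) converts the one-hot rows back to a 1-D array of class ids: [1 0 1 0 2 2 1 0 1 2 0 1 1 1 2 0].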
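To see the shape mismatch in isolation, here is a minimal sketch. It uses np.eye as a NumPy-only stand-in for keras.utils.to_categorical (an assumption made so the snippet runs without Keras), and the five labels are just the first five ids from the output above.

```python
import numpy as np

# Integer class ids for five samples (illustrative values).
labels = np.array([1, 0, 1, 0, 2])

# One-hot encode; np.eye(3)[labels] is a stand-in for to_categorical here.
one_hot = np.eye(3)[labels]
print(one_hot.shape)            # (5, 3)

# Features with the same trailing layout as in the question.
X = np.random.randn(5, 1, 5, 4)

# A 2-D mask of shape (5, 3) cannot index X, whose leading dims are (5, 1).
mask_2d = one_hot == 1
try:
    X[mask_2d]
    failed = False
except IndexError:
    failed = True

# argmax along axis 1 recovers the class ids, giving a 1-D mask of shape (5,).
recovered = one_hot.argmax(axis=1)
mask_1d = recovered == 1
print(X[mask_1d].shape)         # (2, 1, 5, 4)
```

The 1-D mask selects along the first axis only, so every selected sample keeps its full (1, 5, 4) feature block.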

All you need to do here is change lines 22 and 23 to:

    sub_train = trainX[trainY.argmax(axis=1) == i]
    sub_test = testX[testY.argmax(axis=1) == i]
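For completeness, a self-contained sketch of the whole loop with this fix applied. To keep it runnable with NumPy alone, it assumes a simple shuffled 80/20 split in place of train_test_split and np.eye in place of to_categorical; the subsets dictionary is a hypothetical container added for illustration, and the model fitting is left out as in the question.

```python
import numpy as np

rng = np.random.default_rng(42)

# Same shapes as in the question: 20 samples, 3 classes.
data = rng.standard_normal((20, 1, 5, 4))
labels = np.hstack((np.repeat(0, 7), np.repeat(1, 7), np.repeat(2, 6)))
class_ids = np.unique(labels)

# Shuffled 80/20 split (stand-in for sklearn's train_test_split).
perm = rng.permutation(len(labels))
train_idx, test_idx = perm[:16], perm[16:]
trainX, testX = data[train_idx], data[test_idx]

# One-hot targets (stand-in for keras.utils.to_categorical).
trainY = np.eye(len(class_ids))[labels[train_idx]]
testY = np.eye(len(class_ids))[labels[test_idx]]

# Convert the one-hot rows back to class ids once, outside the loop.
train_ids = trainY.argmax(axis=1)
test_ids = testY.argmax(axis=1)

subsets = {}
for i in class_ids:
    sub_train = trainX[train_ids == i]
    sub_test = testX[test_ids == i]
    subsets[int(i)] = (sub_train, sub_test)
    print('Train on class', i, sub_train.shape, sub_test.shape)
    # model fit follows...
```

If the integer labels are still available, keeping a copy of them before the one-hot encoding avoids the argmax step entirely.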
