What is the issue with my last dense keras layer?

Question

I am developing a small neural network in keras for a multi-class classification problem. I have 9 different labels, and my data has 9 features as well.

My train/test shapes are as follows:

Sets shape:
x_train shape: (7079, 9)
y_train shape: (7079,)
x_test shape: (7079, 9)
y_test shape: (7079,)

But when I try to convert the labels to categorical (one-hot) form:

y_train = tf.keras.utils.to_categorical(y_train, num_classes=9)
y_test = tf.keras.utils.to_categorical(y_test, num_classes=9)

I get the following error:

IndexError: index 9 is out of bounds for axis 1 with size 9

Here is some more information about y_train:

print(np.unique(y_train)) # [1. 2. 3. 4. 5. 6. 7. 8. 9.]
print(len(np.unique(y_train))) # 9

Does anyone know where the problem is?

Answer 1

y_train is 1-D, so you have to one-hot encode it, like this:

y_train = tf.keras.utils.to_categorical(y_train , num_classes=9)

The same applies to y_test.

Update

According to the documentation,

tf.keras.utils.to_categorical(y, num_classes=None, dtype="float32")

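Here, y is the class vector to be converted into a matrix (integers from 0 to num_classes - 1). In your case, y_train looks like [1, 2, ...], so with num_classes=9 the label 9 has no matching column. Shift the labels so that they start at zero: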

y_train = tf.keras.utils.to_categorical(y_train - 1, num_classes=9)
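Subtracting 1 works because the labels here are exactly 1 through 9. As a more general sketch, assuming the labels might not be contiguous or might not start at 1, np.unique with return_inverse=True can remap them to zero-based indices. The sample array below is hypothetical and is not the question's data:

```python
import numpy as np

# Hypothetical labels: float-valued and 1-based like the question's y_train,
# but with gaps (no 4, 6, 7 or 8) to show the general case.
y_train = np.array([1., 2., 3., 9., 5., 9., 1.])

# classes: the sorted distinct labels.
# y_zero_based: for each sample, the position of its label within classes.
classes, y_zero_based = np.unique(y_train, return_inverse=True)

print(classes)       # distinct original labels
print(y_zero_based)  # integers in 0..len(classes)-1
```

The remapped vector can then be passed to to_categorical with num_classes=len(classes), and classes[index] recovers the original label for any index.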

Here is an example for reference. If we run this:

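import numpy as np
import tensorflow as tf

# Labels run from 1 to 5, but num_classes=5 only allows indices 0 to 4.
class_vector = np.array([1, 1, 2, 3, 5, 1, 4, 2])
print(class_vector)

output_matrix = tf.keras.utils.to_categorical(
    class_vector, num_classes=5, dtype="float32")
print(output_matrix)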

[1 1 2 3 5 1 4 2]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-15-69c8be7a0f1a> in <module>()
      6 print(class_vector)
      7 
----> 8 output_matrix = tf.keras.utils.to_categorical(class_vector, num_classes = 5, dtype ="float32")
      9 print(output_matrix)

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/np_utils.py in to_categorical(y, num_classes, dtype)
     76   n = y.shape[0]
     77   categorical = np.zeros((n, num_classes), dtype=dtype)
---> 78   categorical[np.arange(n), y] = 1
     79   output_shape = input_shape + (num_classes,)
     80   categorical = np.reshape(categorical, output_shape)

IndexError: index 5 is out of bounds for axis 1 with size 5
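The failing statement in the traceback is a plain NumPy indexed assignment, so the error can be reproduced without TensorFlow. The following is a minimal sketch that mirrors the lines shown in the traceback. It is not the library's full implementation:

```python
import numpy as np

num_classes = 5
class_vector = np.array([1, 1, 2, 3, 5, 1, 4, 2])

n = class_vector.shape[0]
categorical = np.zeros((n, num_classes), dtype="float32")

# The matrix has columns 0..num_classes-1, so label 5 points past the last column.
try:
    categorical[np.arange(n), class_vector] = 1
    failed = False
except IndexError as err:
    failed = True
    print(err)
```

The largest label equals num_classes, which is one past the last valid column index.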

To fix this, we convert the labels to a zero-based format.

output_matrix = tf.keras.utils.to_categorical(
    class_vector - 1, num_classes=5, dtype="float32")
print(output_matrix)

[[1. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 0. 1.]
 [1. 0. 0. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 1. 0. 0. 0.]]
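One consequence of the "- 1" shift is that a model trained on these targets predicts zero-based class indices. A minimal sketch of mapping predictions back to the original 1-based labels follows. The probs array is a made-up stand-in for a model's output, not a real result:

```python
import numpy as np

# Hypothetical model output: one row of class probabilities per sample,
# with columns ordered as zero-based classes 0..4 (original labels 1..5).
probs = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.10, 0.10, 0.60],
])

zero_based = np.argmax(probs, axis=1)  # column with the highest probability
original_labels = zero_based + 1       # undo the earlier "- 1" shift

print(original_labels)
```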
