Why does Keras convert the input shape from (3,3) to (?,3,3)?

Question

I am currently trying to get a custom Keras layer to work. Here is a simplified version:

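from keras import backend as K
from keras.layers import Input, Layer


class MyLayer(Layer):

    def __init__(self, output_dim, **kwargs):
        # output_dim was never set in the simplified version; it is passed in
        # here so that build() can use it.
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        print("input_shape: " + str(input_shape))
        self.kernel = self.add_weight(name='kernel',
                                      shape=(input_shape[1], self.output_dim),
                                      initializer='uniform',
                                      trainable=True)
        super(MyLayer, self).build(input_shape)

    def call(self, x):
        print("input tensor: " + str(x))
        return K.dot(x, self.kernel)


inputs = Input(shape=(3, 3), dtype='float32', name='inputs')
# output_dim=3 is an assumed value; the original snippet did not specify one.
results = MyLayer(3, input_shape=(3, 3))(inputs)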

The resulting console output is:

input_shape: (None, 3, 3)
input tensor: Tensor("inputs:0", shape=(?, 3, 3), dtype=float32)

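As you can see, the input_shape the layer receives is not the (3,3) I specified but (None,3,3). Why is that? The input tensor also has shape (?,3,3), which I assumed was a consequence of the odd input_shape (None,3,3). However, the input tensor still has three dimensions even if super(MyLayer, self).build(input_shape) is replaced with super(MyLayer, self).build((3,3)). What is this mysterious third dimension that Keras adds automatically, and why does it add it?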

Answer 1

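There is nothing mysterious about it: it is the batch dimension. Keras, like most deep learning frameworks, performs its computations on a whole batch of samples at once. This increases parallelism, and it maps directly onto the mini-batches used in stochastic gradient descent.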

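Your layer needs to support batched computation, so the batch dimension is always present in the input and output data, and Keras adds it to input_shape automatically.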
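To make the batched computation concrete, here is a minimal sketch in plain NumPy rather than Keras. The batch size of 4 and output_dim of 2 are assumed values chosen only for illustration. It shows that a single dot product over a (batch, 3, 3) array gives the same result as processing each 3x3 sample separately, with the leading batch axis carried through to the output.

```python
import numpy as np

# Assumed sizes, for illustration only.
batch_size, output_dim = 4, 2

rng = np.random.default_rng(0)
batch = rng.uniform(size=(batch_size, 3, 3))  # four 3x3 samples stacked along axis 0
kernel = rng.uniform(size=(3, output_dim))    # one weight matrix shared by all samples

# One call processes the whole batch; the leading axis is preserved.
batched = np.dot(batch, kernel)

# The same result, computed one sample at a time.
looped = np.stack([np.dot(sample, kernel) for sample in batch])

print(batched.shape)
print(np.allclose(batched, looped))
```

The layer's weights carry no batch axis; only the data flowing through the layer does, which is why build() sees a leading None while the kernel shape is defined from the remaining dimensions.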

Answer 2

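The newly added dimension is the batch dimension: in your case, you will be passing in batches of 3x3 tensors. It shows up as None because the batch size is not yet known when the graph is built.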

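If you look at the description of the Input layer on the Keras core layers page, https://keras.io/layers/core/, you will see that the shape argument you pass when creating an Input layer is defined as follows: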

shape: A shape tuple (integer), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors.
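The convention described in that definition can be sketched in a few lines of plain Python. The helper full_input_shape below is hypothetical and is not part of Keras; it only mimics how a per-sample shape relates to the full shape that a layer sees, with None standing in for an unknown batch size.

```python
def full_input_shape(shape, batch_size=None):
    """Hypothetical helper: prepend the batch dimension to a per-sample shape."""
    return (batch_size,) + tuple(shape)


print(full_input_shape((3, 3)))     # (None, 3, 3)
print(full_input_shape((32,)))      # (None, 32)
print(full_input_shape((3, 3), 8))  # (8, 3, 3)
```

The first call reproduces the (None, 3, 3) reported in the question, and the last one shows what the shape looks like once a batch size is fixed.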

Answer 3

If you want to specify the batch size, you can do so like this:

inputs = Input(batch_shape=(batch_size, height, width, channel))

python

dimensions

keras

tensorflow