Tensorflow.js: Error in gradient for op maximum. The gradient of input '$a' has shape '32,200', which does not match the shape of the input '32,1'

Question

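I built a very simple Tensorflow.js model and everything seems to make sense, but when I call the fit function, the model fails to backpropagate the gradient and throws the error above: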

Error in gradient for op maximum. 

The gradient of input '$a' has shape '32,200', 
which does not match the shape of the input '32,1'

Here are the shapes and dtypes of xTrain and yTrain:

xTrain
  Array(3) [2000, 20, 73]
  float32
yTrain
  Array(2) [2000, 200]
  float32

Here are the model's expected input and output:

model.input
  Array(3) [null, 20, 73]
  float32
model.outputs[0]
  Array(2) [null, 200]
  float32

[Edit] I should note that the problem only occurs when I try to use:

loss: 'cosineProximity'

Here is my code:

console.log("starting compute_and_save_model");

const model = tf.sequential();
model.add(tf.layers.simpleRNN({
    units: length_of_embedding,//amount_of_rnn_units,
    recurrentInitializer: 'glorotNormal',
    inputShape: [max_len, recogized_letters.length],
    returnSequences: false, // TensorFlow.js config keys are camelCase
}));

console.log(model.input.shape);
console.log(model.input.dtype);
console.log(model.outputs[0].shape);
console.log(model.outputs[0].dtype);
console.log(model.batchInputShape);

model.compile({
    loss: 'cosineProximity',
    optimizer: 'adam',
    metrics: ['acc']
});

console.log("starting compute_and_save_model (fit)")

await model.fit(xTrain, yTrain, {
    epochs: 2,
    batchSize: 32,
    validationSplit: 0.2,
    callbacks: {
        onBatchBegin(b) {
            console.log("starting compute_and_save_model (fit:"+b+")");
        }
    }
});

Runnable from https://stackblitz.com/edit/js-ddlwge

Does anyone know what might be going wrong here?

EDIT: I tried writing my own cosineProximity implementation and got the same error. For reference, here is my implementation:
const cosine = tf.layers.dot({axes: -1,normalize:true})
loss: function(a,b) {
    return tf.neg(tf.mean(cosine.apply([a,b])));
},

Answer 1

OK, I spent some time on this, and it looks like a bug in the Tensorflow.js implementation.

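If you run into the same problem, you can fix it by applying the following patch yourself (I trust the tfjs-layers maintainers will eventually merge this pull request, so hopefully you won't hit this issue in the future).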

https://github.com/tensorflow/tfjs-layers/pull/499

| export function l2Normalize(x: Tensor, axis?: number): Tensor {
|   return tidy(() => {
|     const squareSum = tfc.sum(K.square(x), axis, true);
-     const epsilonTensor = tfc.mul(scalar(epsilon()), tfc.onesLike(x));
+     const epsilonTensor = tfc.mul(scalar(epsilon()), tfc.onesLike(squareSum));
|     const norm = tfc.sqrt(tfc.maximum(squareSum, epsilonTensor));
|     return tfc.div(x, norm);
|   });
| }
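
To see why the one-line change matters, here is a minimal plain-JavaScript sketch of the shapes involved. It does not call TensorFlow.js at all: `broadcastShape` and `sameShape` are hypothetical helpers written only for this illustration, and the shapes are taken from the error message (batch size 32, 200 units). The assumption is that the gradient of `maximum` comes back in the broadcast output shape, which is what the error message suggests.

```javascript
// Hypothetical helper: NumPy-style broadcast of two shapes (not a TensorFlow.js API).
function broadcastShape(a, b) {
  const rank = Math.max(a.length, b.length);
  const out = new Array(rank);
  for (let i = 0; i < rank; i++) {
    // Align from the trailing dimension; a missing leading dimension counts as 1.
    const da = a[a.length - 1 - i] ?? 1;
    const db = b[b.length - 1 - i] ?? 1;
    if (da !== db && da !== 1 && db !== 1) {
      throw new Error(`Shapes [${a}] and [${b}] are not broadcastable`);
    }
    out[rank - 1 - i] = Math.max(da, db);
  }
  return out;
}

// Hypothetical helper: element-wise shape comparison.
const sameShape = (p, q) => p.length === q.length && p.every((d, i) => d === q[i]);

const x = [32, 200];       // layer output for one batch
const squareSum = [32, 1]; // sum over the last axis with keepDims = true

// Before the patch: epsilonTensor is shaped like x, so maximum() broadcasts up.
const maxBefore = broadcastShape(squareSum, x);        // [32, 200]

// After the patch: epsilonTensor is shaped like squareSum, so nothing broadcasts.
const maxAfter = broadcastShape(squareSum, squareSum); // [32, 1]

console.log(sameShape(maxBefore, squareSum)); // false: gradient shape would not match '$a'
console.log(sameShape(maxAfter, squareSum));  // true: gradient shape matches '$a'
```

With the unpatched code, the first argument of `maximum` has shape [32, 1] while the result has shape [32, 200], which matches the two shapes quoted in the error. With the patch both operands share the shape [32, 1], so the mismatch cannot arise.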
