Use temporal class weights in many-to-one RNNs in Keras

Question

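I have a model that uses several many-to-many RNN layers, followed by a many-to-one layer and a one-to-one decoding layer.

So the first RNN layer uses the argument "return_sequences=True" and the second RNN layer uses "return_sequences=False".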

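This works so far, but it breaks as soon as I add sample weights. If I had a regular many-to-many network, I would just set sample_weight_mode="temporal" in the compile() function and define the weights as a 2D matrix.

In my many-to-one case, however, this doesn't work: I get an error saying that temporal weights expect a time dimension in the output. I realize this is probably because my decoding layer is no longer temporal (many-to-one). But I can't run the network using non-temporal weights as this would not work together with the many-to-many layers.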

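Is there a solution for sample weights when mixing many-to-many and many-to-one layers?

Here is my model, which hopefully clarifies things a bit: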

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
the_input (InputLayer)       (None, 20, 249)           0         
_________________________________________________________________
masking_1 (Masking)          (None, 20, 249)           0         
_________________________________________________________________
time_distributed_1 (TimeDist (None, 20, 64)            16000     
_________________________________________________________________
relu (Activation)            (None, 20, 64)            0         
_________________________________________________________________
simple_rnn_1 (SimpleRNN)     (None, 20, 16)            1296      
_________________________________________________________________
simple_rnn_2 (SimpleRNN)     (None, 16)                528       
_________________________________________________________________
dense_2 (Dense)              (None, 36)                612       
_________________________________________________________________
softmax (Activation)         (None, 36)                0         
=================================================================

Answer 1

TL;DR: just set sample_weight_mode to None

But I can't run the network using non-temporal weights as this would not work together with the many-to-many layers.

This sentence makes me think you haven't actually understood what sample weighting is supposed to do. From the Keras docs:

sample_weight: Optional Numpy array of weights for the training samples, used for weighting the loss function

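This means your sample weights are used only in the loss function (and, in newer versions, for weighted metrics as well). Since your loss function is a function of your output alone, I see no reason why your time-dependent intermediate layers would affect anything.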
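To make that concrete, here is a rough pure-Python sketch of what weighting the loss amounts to for a model whose final output has shape (batch, num_classes). The function names are made up for illustration, and the normalisation (dividing by batch size) is an assumption; the exact reduction Keras applies differs between versions.

```python
import math

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for ONE sample: one-hot target vs. predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

def weighted_mean_loss(y_true_batch, y_pred_batch, sample_weight):
    """Scale each sample's loss by its weight, then average over the batch.

    Assumption: plain division by batch size; Keras' own reduction may differ.
    """
    losses = [categorical_crossentropy(t, p)
              for t, p in zip(y_true_batch, y_pred_batch)]
    return sum(w * l for w, l in zip(sample_weight, losses)) / len(losses)

# Two sequences, three classes. The model emits ONE prediction per sequence,
# so the loss never sees a time axis, whatever the hidden RNN layers return.
y_true = [[1, 0, 0], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]

unweighted = weighted_mean_loss(y_true, y_pred, [1.0, 1.0])
weighted = weighted_mean_loss(y_true, y_pred, [1.0, 3.0])  # 2nd sample counts 3x
print(unweighted, weighted)
```

Note that the weight vector has one entry per sample. Nothing in this computation touches the intermediate many-to-many layers.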

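If you have only one output per time series, then your sample is the whole time series, not each time step. So make your sample weights no different from the ones you would use for an MLP, i.e. one weight per sequence rather than one per time step.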
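Since the question title mentions class weights, one possible way to build such per-sequence weights is to derive them from the class of each sequence. This is only a sketch: the helper name and the label list are hypothetical, and the inverse-frequency ("balanced") heuristic is my own choice, not something the question specifies.

```python
from collections import Counter

def per_sequence_weights(labels):
    """Return one weight per sequence, inversely proportional to class frequency.

    Heuristic (an assumption): weight = n_samples / (n_classes * class_count).
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    class_weight = {c: n_samples / (n_classes * k) for c, k in counts.items()}
    return [class_weight[c] for c in labels]

# Hypothetical class index for each of four sequences (one label per sequence).
labels = [0, 0, 0, 1]
weights = per_sequence_weights(labels)
print(weights)  # one entry per sequence, no time dimension
```

Converted to a 1D NumPy array, this could then be passed as model.fit(x, y, sample_weight=weights) with sample_weight_mode left at None.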
