Keras: LSTM with class weights
My question is closely related to this one, but also goes beyond it.

I am trying to implement the following LSTM in Keras:

- number of timesteps: nb_tsteps=10
- number of input features: nb_feat=40
- number of LSTM units per timestep: 120
- the LSTM layer is followed by TimeDistributedDense layers

From the question mentioned above I understand that I have to present the input data as (nb_samples, 10, 40), where I get nb_samples by rolling a window of length nb_tsteps=10 over the original timeseries of shape (5932720, 40). The code is thus:
model = Sequential()
model.add(LSTM(120, input_shape=(X_train.shape[1], X_train.shape[2]),
return_sequences=True, consume_less='gpu'))
model.add(TimeDistributed(Dense(50, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(20, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(10, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(3, activation='relu')))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
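The rolling-window construction described above can be sketched like this (a small hypothetical X_raw stands in for the (5932720, 40) timeseries):

```python
import numpy as np

# Hypothetical raw data standing in for the (5932720, 40) timeseries.
X_raw = np.arange(100 * 40, dtype=np.float32).reshape(100, 40)
nb_tsteps = 10

# Roll a window of length nb_tsteps over the first axis:
# sample i covers rows i .. i + nb_tsteps - 1.
nb_samples = X_raw.shape[0] - nb_tsteps + 1
X_train = np.stack([X_raw[i:i + nb_tsteps] for i in range(nb_samples)])

print(X_train.shape)  # (91, 10, 40)
```

Each sample overlaps the next by nb_tsteps-1 rows, which is why nb_samples is slightly smaller than the number of raw rows.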
Now to my actual question (assuming the above is correct so far):

The binary responses (0/1) are heavily imbalanced, so I need to pass a class_weight dictionary cw = {0: 1, 1: 25} to model.fit(). However, I get the exception class_weight not supported for 3+ dimensional targets. This is because I present the response data as (nb_samples, 1, 1). If I reshape it into a 2D array (nb_samples, 1), I get the exception Error when checking model target: expected timedistributed_5 to have 3 dimensions, but got array with shape (5932720, 1).

Thanks a lot for any help!
I think you should use sample_weight with sample_weight_mode='temporal'.

From the Keras docs:
sample_weight: Numpy array of weights for the training samples, used
for scaling the loss function (during training only). You can either
pass a flat (1D) Numpy array with the same length as the input samples
(1:1 mapping between weights and samples), or in the case of temporal
data, you can pass a 2D array with shape (samples, sequence_length),
to apply a different weight to every timestep of every sample. In this
case you should make sure to specify sample_weight_mode="temporal" in
compile().
In your case you would need to supply a 2D array with the same shape as your labels.
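As a minimal sketch of that idea (the tiny y array below is hypothetical; the weights mirror the intended cw = {0: 1, 1: 25}):

```python
import numpy as np

# Hypothetical imbalanced labels in the 2D (samples, sequence_length)
# layout that sample_weight_mode='temporal' expects; here sequence_length=1.
y = np.array([[0], [1], [0], [0], [1]])

# Build per-timestep weights mirroring the class_weight dict cw = {0: 1, 1: 25}.
cw = {0: 1, 1: 25}
sample_weight = np.where(y == 1, cw[1], cw[0]).astype('float32')

print(sample_weight.shape)  # (5, 1)
# Then compile with sample_weight_mode='temporal' and pass
# sample_weight=sample_weight to model.fit(); the model target itself
# stays 3D, e.g. y[..., None] with shape (nb_samples, 1, 1).
```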
If this is still an issue: I think the TimeDistributed layer expects to return a 3D array (somewhat similar to having return_sequences=True in a regular LSTM layer). Try adding a Flatten() layer or another LSTM layer at the end, before the prediction layer:
d = TimeDistributed(Dense(10))(input_from_previous_layer)
lstm_out = Bidirectional(LSTM(10))(d)
output = Dense(1, activation='sigmoid')(lstm_out)
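If you go that route, here is a self-contained sketch in the questioner's Sequential style (using the modern tf.keras API rather than the Keras 1 consume_less argument; layer sizes are illustrative):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed, Flatten

# Collapse the time axis before the final Dense so the model target is
# 2D (nb_samples, 1) and class_weight={0: 1, 1: 25} is accepted by fit().
model = Sequential()
model.add(LSTM(120, input_shape=(10, 40), return_sequences=True))
model.add(TimeDistributed(Dense(10, activation='relu')))
model.add(Flatten())                      # (None, 10, 10) -> (None, 100)
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

print(model.output_shape)  # (None, 1)
```

With a 2D output the original class_weight dictionary can be passed to model.fit() directly, instead of going through sample_weight.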
Using sample_weight_mode='temporal' is a workaround. Check out this Stack Overflow thread; the issue is also documented on GitHub.