Keras: LSTM with class weights
My question is closely related to this one, but also goes beyond it.

I am trying to implement the following LSTM in Keras:

- number of timesteps: nb_tsteps=10
- number of input features: nb_feat=40
- number of LSTM units per timestep: 120
- the LSTM layer is followed by TimeDistributedDense layers

From the question mentioned above I understand that I have to present the input data as (nb_samples, 10, 40), where I get nb_samples by rolling a window of length nb_tsteps=10 over the original timeseries of shape (5932720, 40). The code is thus:
model = Sequential()
model.add(LSTM(120, input_shape=(X_train.shape[1], X_train.shape[2]),
return_sequences=True, consume_less='gpu'))
model.add(TimeDistributed(Dense(50, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(20, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(10, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(3, activation='relu')))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
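The rolling-window construction described above can be sketched like this (a small hypothetical X_raw stands in for the (5932720, 40) timeseries):

```python
import numpy as np

# Hypothetical raw data standing in for the (5932720, 40) timeseries.
X_raw = np.arange(100 * 40, dtype=np.float32).reshape(100, 40)
nb_tsteps = 10

# Roll a window of length nb_tsteps over the first axis:
# sample i covers rows i .. i + nb_tsteps - 1.
nb_samples = X_raw.shape[0] - nb_tsteps + 1
X_train = np.stack([X_raw[i:i + nb_tsteps] for i in range(nb_samples)])

print(X_train.shape)  # (91, 10, 40)
```

Each sample overlaps the next by nb_tsteps-1 rows, which is why nb_samples is slightly smaller than the number of raw rows.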
Now to my actual question (assuming the above is correct so far):

The binary responses (0/1) are heavily imbalanced, so I need to pass a class_weight dictionary cw = {0: 1, 1: 25} to model.fit(). However, I get the exception class_weight not supported for 3+ dimensional targets. This is because I present the response data as (nb_samples, 1, 1). If I reshape it into a 2D array (nb_samples, 1), I get the exception Error when checking model target: expected timedistributed_5 to have 3 dimensions, but got array with shape (5932720, 1).

Thanks a lot for any help!
I think you should use sample_weight with sample_weight_mode='temporal'.

From the Keras docs:
sample_weight: Numpy array of weights for the training samples, used
for scaling the loss function (during training only). You can either
pass a flat (1D) Numpy array with the same length as the input samples
(1:1 mapping between weights and samples), or in the case of temporal
data, you can pass a 2D array with shape (samples, sequence_length),
to apply a different weight to every timestep of every sample. In this
case you should make sure to specify sample_weight_mode="temporal" in
compile().
In your case you would need to supply a 2D array with the same shape as your labels.
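As a minimal sketch of that idea (the tiny y array below is hypothetical; the weights mirror the intended cw = {0: 1, 1: 25}):

```python
import numpy as np

# Hypothetical imbalanced labels in the 2D (samples, sequence_length)
# layout that sample_weight_mode='temporal' expects; here sequence_length=1.
y = np.array([[0], [1], [0], [0], [1]])

# Build per-timestep weights mirroring the class_weight dict cw = {0: 1, 1: 25}.
cw = {0: 1, 1: 25}
sample_weight = np.where(y == 1, cw[1], cw[0]).astype('float32')

print(sample_weight.shape)  # (5, 1)
# Then compile with sample_weight_mode='temporal' and pass
# sample_weight=sample_weight to model.fit(); the model target itself
# stays 3D, e.g. y[..., None] with shape (nb_samples, 1, 1).
```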
If this is still an issue: I think the TimeDistributed layer expects to return a 3D array (somewhat similar to having return_sequences=True in a regular LSTM layer). Try adding a Flatten() layer or another LSTM layer at the end, before the prediction layer:
d = TimeDistributed(Dense(10))(input_from_previous_layer)
lstm_out = Bidirectional(LSTM(10))(d)
output = Dense(1, activation='sigmoid')(lstm_out)
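If you go that route, here is a self-contained sketch in the questioner's Sequential style (using the modern tf.keras API rather than the Keras 1 consume_less argument; layer sizes are illustrative):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed, Flatten

# Collapse the time axis before the final Dense so the model target is
# 2D (nb_samples, 1) and class_weight={0: 1, 1: 25} is accepted by fit().
model = Sequential()
model.add(LSTM(120, input_shape=(10, 40), return_sequences=True))
model.add(TimeDistributed(Dense(10, activation='relu')))
model.add(Flatten())                      # (None, 10, 10) -> (None, 100)
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

print(model.output_shape)  # (None, 1)
```

With a 2D output the original class_weight dictionary can be passed to model.fit() directly, instead of going through sample_weight.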
Using sample_weight_mode='temporal' is a workaround. Check out this Stack Overflow thread; the issue is also documented on GitHub.