Keras Model With CuDNNLSTM Layers Doesn't Work on Production Server

Question

I trained the following model with GPU acceleration on an AWS p3 instance:

x = CuDNNLSTM(128, return_sequences=True)(inputs)
x = Dropout(0.2)(x)
x = CuDNNLSTM(128, return_sequences=False)(x)
x = Dropout(0.2)(x)
predictions = Dense(1, activation='tanh')(x)
model = Model(inputs=inputs, outputs=predictions)

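After training, I saved the model with Keras's save_model function and moved it to a separate production server that has no GPU.

When I try to run predictions with the model on the production server, it fails with the following error: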

No OpKernel was registered to support Op 'CudnnRNN' with these attrs. Registered devices: [CPU], Registered kernels:

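I'm guessing this is because the production server has no GPU support, but I had hoped that wouldn't be a problem. Is there any way to use this model on a production server without a GPU?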

Answer 1

No, you can't: CuDNN requires a CUDA GPU. You have to replace the CuDNNLSTM layers with standard LSTM layers.

Answer 2

Try

pip install tensorflow-gpu

Answer 3

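You can definitely train with CuDNNLSTM and then run inference with LSTM.

The trick is to change the layer architecture in the .json file from CuDNNLSTM to LSTM before loading the h5 file. When you load the h5 file, the doubled bias of the CuDNNLSTM weights is converted automatically, but Keras will not change your .json file for you.

In other words, open your saved .json model, change every instance of CuDNNLSTM to LSTM, save the .json file, and then load your .h5 file. You should then be able to run inference with the model.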
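The edit described above can also be scripted instead of done by hand. The following is a minimal sketch, not an official Keras utility: the function name `cudnn_to_lstm_config` is hypothetical, and it assumes the saved architecture uses the usual Keras JSON layout in which each layer is a dict with `class_name` and `config` keys. It also sets `recurrent_activation` to `'sigmoid'` on each converted layer, following the advice elsewhere in this thread; that step is an addition to what this answer describes.

```python
import json


def cudnn_to_lstm_config(model_json, recurrent_activation="sigmoid"):
    """Return a copy of a Keras model JSON string with CuDNNLSTM renamed to LSTM.

    Hypothetical helper: walks the whole JSON tree so that nested models
    and wrapped layers are handled as well.
    """
    config = json.loads(model_json)

    def convert(node):
        if isinstance(node, dict):
            if node.get("class_name") == "CuDNNLSTM":
                node["class_name"] = "LSTM"
                # CuDNN hard-codes a sigmoid recurrent activation, so state it
                # explicitly rather than relying on the LSTM default.
                layer_config = node.setdefault("config", {})
                layer_config["recurrent_activation"] = recurrent_activation
            for value in node.values():
                convert(value)
        elif isinstance(node, list):
            for item in node:
                convert(item)

    convert(config)
    return json.dumps(config)
```

Assuming the architecture was saved to a file such as `model.json` and the weights to `model.h5` (both file names are placeholders), the converted string can then be passed to `keras.models.model_from_json`, followed by `model.load_weights('model.h5')` on the resulting model.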

Answer 4

Short answer: Yes, you can.

Just recreate your architecture using LSTM layers instead of CuDNNLSTM.

Your code should look like this:

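from keras.layers import LSTM, Dropout, Dense
from keras.models import Model

# `inputs` is the same Input tensor used when building the original model.
x = LSTM(128, return_sequences=True, recurrent_activation='sigmoid')(inputs)
x = Dropout(0.2)(x)
x = LSTM(128, return_sequences=False, recurrent_activation='sigmoid')(x)
x = Dropout(0.2)(x)
predictions = Dense(1, activation='tanh')(x)
model = Model(inputs=inputs, outputs=predictions)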

Then

model.load_weights(path_to_your_weights_file)

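Note recurrent_activation='sigmoid'. This is important.

Detailed explanation:

LSTM and CuDNNLSTM are compatible with each other, so you can load weights from one into the other without any problem. However, their default activation functions differ slightly. Sometimes this causes only small differences between the two, but it can also cause large ones, as reported here.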

[The activation values] for CuDNNLSTM [...] are hard-coded in CuDNN and cannot be changed from Keras. They correspond to activation='tanh' and recurrent_activation='sigmoid' (slightly different than default hard_sigmoid in [LSTM] Keras). ref
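To see why the mismatch matters, the two activations can be compared directly. This is a small illustrative sketch in plain Python, assuming the Keras 2.x definition of hard_sigmoid as a piecewise-linear function, clip(0.2 * x + 0.5, 0, 1); other Keras versions may define it differently.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, the recurrent activation hard-coded in CuDNN."""
    return 1.0 / (1.0 + math.exp(-x))


def hard_sigmoid(x):
    """Piecewise-linear approximation (assumed Keras 2.x definition)."""
    return max(0.0, min(1.0, 0.2 * x + 0.5))


# Print both activations and their gap for a few gate pre-activations.
for x in (-3.0, -1.0, 0.0, 1.0, 3.0):
    gap = abs(sigmoid(x) - hard_sigmoid(x))
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.4f}  hard_sigmoid={hard_sigmoid(x):.4f}  gap={gap:.4f}")
```

The two functions agree at 0 but drift apart away from it, and hard_sigmoid saturates at exactly 0 or 1 while the true sigmoid does not. Because these values gate the LSTM state at every time step, weights trained under one activation can behave differently under the other.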

Answer 5

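The easiest way to solve this is to replace the CuDNN layers with the regular Keras layers, i.e. convert CuDNNLSTM to LSTM, and so on.

If you are using Google Colab, go to Runtime > Change runtime type and set the hardware accelerator to GPU.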
