Python TensorRT: CUDNN_STATUS_MAPPING_ERROR error

Question

When running a face recognition algorithm that uses TensorRT's Python API (together with PyCUDA), I get the following errors:

[TensorRT] ERROR: ../rtSafe/safeContext.cpp (133) - Cudnn Error in configure: 7 (CUDNN_STATUS_MAPPING_ERROR)

[TensorRT] ERROR: FAILED_EXECUTION: std::exception

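The code still compiles and runs, but the results are inaccurate: the program's output jumps between 0.999999907141816 and 0, where I would expect a more continuous range of values. I have tested this with TF-TRT and Keras, and my code works in both (with a few minor changes to account for the differences between the TF and Keras APIs).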

I have tried installing different versions of CUDA (9.0, 10.0, and 10.1) and cuDNN (7.6.3 and 7.6.5). The TensorRT version is 6.0.1.5 and PyCUDA is 2019.1.2. In case it helps, I am running on Ubuntu 18.04.

Any help would be appreciated!

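Update: I believe the error is caused by running a TensorFlow session at the same time. Specifically, I am using the mtcnn package, which may be interfering with TensorRT. When mtcnn initializes its TF session, the error above appears; when mtcnn is not used, the error does not occur and everything runs as expected.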

Answer 1

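Fixed. The error appears to have been caused by a GPU memory conflict between TensorFlow and TensorRT. Because I use both in the same program, I assume they conflict in how they handle GPU allocation. The solution is to enter a TensorFlow session with allow_growth=True before allocating buffers and creating the asynchronous stream with PyCUDA and TensorRT.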

1. Enter the TensorFlow session (this must happen before step 2):

tf.Session(config=tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))).__enter__()
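If the TensorFlow session is created inside a library rather than in your own code, the same on-demand allocation can possibly be requested through an environment variable instead. The sketch below assumes a TensorFlow build that honors TF_FORCE_GPU_ALLOW_GROWTH (described in the TensorFlow GPU guide). The helper name force_tf_allow_growth is made up for illustration, and this variant is untested against the setup in the question.

```python
import os


def force_tf_allow_growth() -> None:
    """Ask TensorFlow to grow GPU memory on demand instead of reserving it all."""
    # The variable must be set before TensorFlow initializes the GPU; setting it
    # before importing tensorflow (or any package that imports it) is the safe option.
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


force_tf_allow_growth()
# import tensorflow as tf   # import only after the variable is set
```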

2. Allocate the buffers (after step 1; from https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#serial_model_python):

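import numpy as np
import pycuda.driver as cuda
import tensorrt as trt

# Assumes an active CUDA context and that `engine` is an already built or deserialized trt.ICudaEngine.
# Create page-locked host buffers sized from the input (0) and output (1) bindings.
h_input = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(0)), dtype=np.float32)
h_output = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(1)), dtype=np.float32)
# Allocate device memory for inputs and outputs.
d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)
# Create a stream in which to copy inputs/outputs and run inference.
stream = cuda.Stream()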
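To get a feel for how much memory these buffers take, note that trt.volume returns the number of elements in a shape and that each float32 element occupies 4 bytes. Below is a minimal pure-Python sketch of that arithmetic. The shapes (3, 160, 160) and (128,) are hypothetical placeholders, not values from the question, and buffer_nbytes is an illustrative helper.

```python
import math

FLOAT32_BYTES = 4  # size of one np.float32 element


def buffer_nbytes(shape, itemsize=FLOAT32_BYTES):
    """Bytes needed for a dense buffer of the given shape (element count * itemsize)."""
    return math.prod(shape) * itemsize


# Hypothetical binding shapes, for illustration only.
input_nbytes = buffer_nbytes((3, 160, 160))
output_nbytes = buffer_nbytes((128,))
print(input_nbytes, output_nbytes)  # 307200 512
```

For shapes like these, those are the sizes that cuda.mem_alloc(h_input.nbytes) and cuda.mem_alloc(h_output.nbytes) would request.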

3. Run inference (see https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#perform_inference_python).
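A sketch of that inference step, following the pattern in the linked guide for TensorRT 6: copy the input to the device, run the engine asynchronously, copy the result back, and wait for the stream. The function name run_inference and its parameter layout are illustrative. context is assumed to come from engine.create_execution_context(), cuda is assumed to be the pycuda.driver module, and the remaining arguments are the buffers and stream allocated above.

```python
def run_inference(context, cuda, h_input, d_input, h_output, d_output, stream):
    """Run one asynchronous inference pass and return the host output buffer."""
    # Copy the input from page-locked host memory to the device.
    cuda.memcpy_htod_async(d_input, h_input, stream)
    # Launch inference; bindings are the device addresses in binding order.
    context.execute_async(bindings=[int(d_input), int(d_output)],
                          stream_handle=stream.handle)
    # Copy the predictions back to page-locked host memory.
    cuda.memcpy_dtoh_async(h_output, d_output, stream)
    # Block until all work queued on the stream has finished.
    stream.synchronize()
    return h_output
```

With the names from the answer, this would be called as run_inference(context, cuda, h_input, d_input, h_output, d_output, stream) after filling h_input with the preprocessed image, where context is created once and reused across calls.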

