CuPy error when pushing / popping pycuda context

Question

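I am using TensorRT to perform inference with CUDA. I would like to use CuPy to preprocess some images that I will then feed to the TensorRT engine. The preprocessing function, called my_function, works fine as long as TensorRT is not run between calls to my_function (see the code below). More precisely, the issue is not strictly related to TensorRT itself, but to the fact that TensorRT inference needs to be wrapped in push and pop operations on a pycuda context.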

With the code below, the last execution of my_function raises the following error:

  File "/home/ubuntu/myfile.py", line 188, in _pre_process_cuda
    img = ndimage.zoom(img, scaling_factor)
  File "/home/ubuntu/.local/lib/python3.6/site-packages/cupyx/scipy/ndimage/interpolation.py", line 482, in zoom
    kern(input, zoom, output)
  File "cupy/core/_kernel.pyx", line 822, in cupy.core._kernel.ElementwiseKernel.__call__
  File "cupy/cuda/function.pyx", line 196, in cupy.cuda.function.Function.linear_launch
  File "cupy/cuda/function.pyx", line 164, in cupy.cuda.function._launch
  File "cupy_backends/cuda/api/driver.pyx", line 299, in cupy_backends.cuda.api.driver.launchKernel
  File "cupy_backends/cuda/api/driver.pyx", line 124, in cupy_backends.cuda.api.driver.check_status
cupy_backends.cuda.api.driver.CUDADriverError: CUDA_ERROR_INVALID_HANDLE: invalid resource handle

Note: the code below does not include the full TensorRT inference code. In fact, simply pushing and popping the pycuda context is enough to produce the error.

Code:

import numpy as np
import cv2
import time
from PIL import Image
import requests
from io import BytesIO
from matplotlib import pyplot as plt
import cupy as cp
from cupyx.scipy import ndimage
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit


def my_function(numpy_frame):
    dtype = 'float32'
    img = cp.array(numpy_frame, dtype='float32')
    # print(img)
    img = ndimage.zoom(img, (0.5, 0.5, 3))
    img = (cp.array(2, dtype=dtype) / cp.array(255, dtype=dtype)) * img - cp.array(1, dtype=dtype)
    img = img.transpose((2, 0, 1))
    img = img.ravel()
    return img


# load image
url = "https://www.pexels.com/photo/109919/download/?search_query=&tracking_id=411xe21veam"
response = requests.get(url)
img = Image.open(BytesIO(response.content))
img = np.array(img)

# initialize tensorrt
TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
trt_runtime = trt.Runtime(TRT_LOGGER)
cfx = cuda.Device(0).make_context()


my_function(img)  # ok
my_function(img)  # ok

# ----- TENSORRT ---------
cfx.push()
# .... tensorrt inference....
cfx.pop()
# ----- TENSORRT ---------

my_function(img)  # <---- error

I even tried a different approach, but unfortunately the result is the same:

cfx.push()
my_function(img)  # ok
cfx.pop()

cfx.push()
my_function(img)  # error
cfx.pop()

@admin: if you can think of a better title for this question, feel free to edit it :)

Answer 1

There are multiple contexts open. For instance, it seems that each of the following leaves a context on the context stack:

import pycuda.autoinit
cfx = cuda.Device(0).make_context()
cfx.push()

So if you run the three commands above, a single cfx.pop() is not enough. You may need to call cfx.pop() three times to pop all of the contexts.
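
The bookkeeping can be illustrated without a GPU. The following is only a minimal sketch: FakeContext is a hypothetical stand-in that tracks a stack and is not part of PyCUDA, and pushed is a hypothetical helper name. It shows how pairing every push with exactly one pop keeps the stack depth unchanged, so the context that was current before the inference block is current again afterwards.

```python
from contextlib import contextmanager


class FakeContext:
    """Hypothetical stand-in for a pycuda context; it only tracks a stack."""

    stack = []  # shared list, mimicking the per-thread CUDA context stack

    def push(self):
        FakeContext.stack.append(self)

    @staticmethod
    def pop():
        # Removes whatever is currently on top of the stack.
        FakeContext.stack.pop()


@contextmanager
def pushed(ctx):
    """Push ctx on entry and pop it on exit, even if the body raises."""
    ctx.push()
    try:
        yield ctx
    finally:
        ctx.pop()


# Mimic the setup from the question (assumption: both calls leave a
# context on the stack, as the answer suggests).
autoinit_ctx = FakeContext()
autoinit_ctx.push()  # stands in for `import pycuda.autoinit`
cfx = FakeContext()
cfx.push()           # stands in for `cuda.Device(0).make_context()`

depth_before = len(FakeContext.stack)
with pushed(cfx):
    # ... inference would run here ...
    depth_inside = len(FakeContext.stack)
depth_after = len(FakeContext.stack)

print(depth_before, depth_inside, depth_after)
```

With a real pycuda context, the same pushed helper could presumably wrap the TensorRT inference block, since it relies only on the push() and pop() methods. Whether this alone removes the CuPy error depends on which context CuPy was using when it first compiled its kernels, so treat it as a starting point rather than a confirmed fix.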
