PyTorch to ONNX: Could not find an implementation for RandomNormalLike
I am trying to convert a fairly complex model from PyTorch to ONNX. The conversion succeeds without errors, but I get this error when loading the model:
Traceback (most recent call last):
  File "/home/***/***/***.py", line 50, in <module>
    main()
  File "/home/***/***/***.py", line 38, in main
    ort_session = ort.InferenceSession(onnx_path, providers=[
  File "/home/***/miniconda3/envs/***/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 324, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/***/miniconda3/envs/***/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 369, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for RandomNormalLike(1) node with name 'RandomNormalLike_598'
I think the RandomNormalLike node that the error complains about probably corresponds to this module of mine:
from typing import Optional

import torch
from torch import nn


class NoiseInjection(nn.Module):
    def __init__(self):
        super().__init__()
        # Learnable scale applied to the noise; starts at zero.
        self.weight = nn.Parameter(torch.zeros(1), requires_grad=True)

    def forward(
        self,
        feat: torch.Tensor,
        noise: Optional[torch.Tensor] = None,
    ) -> torch.Tensor:
        if noise is None:
            # One noise channel, broadcast over all feature channels.
            batch, _, height, width = feat.shape
            noise = torch.randn(
                batch, 1, height, width,
                dtype=feat.dtype,
                device=feat.device,
            )
        return feat + self.weight * noise
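I also wrote a different implementation, but it led to the same error: (Edit: this version does in fact work. I had made an unrelated mistake elsewhere that misled me into thinking it did not.)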
    def forward(
        self,
        feat: torch.Tensor,
        noise: Optional[torch.Tensor] = None,
    ) -> torch.Tensor:
        if noise is None:
            noise = torch.randn_like(feat[:, 0:1])
        return feat + self.weight * noise
My PyTorch and ONNX versions are as follows:
$ conda list torch
# Name Version Build Channel
torch 1.10.0+cu113 pypi_0 pypi
torchaudio 0.10.0+cu113 pypi_0 pypi
torchvision 0.11.1+cu113 pypi_0 pypi
$ conda list onnx
# Name Version Build Channel
onnx 1.10.2 pypi_0 pypi
onnxruntime-gpu 1.9.0 pypi_0 pypi
How can I export a module like this to ONNX and run it successfully?
Searching online, I found a similar issue on GitHub involving conv (https://github.com/microsoft/onnxruntime/issues/3130). Possibly the type of the argument used in torch is incompatible with the implementations of RandomNormalLike available in ONNX.
Could you inspect the RandomNormalLike node(s) in netron and check whether they conform to the spec: https://github.com/onnx/onnx/blob/main/docs/Operators.md#RandomNormal or https://github.com/onnx/onnx/blob/main/docs/Operators.md#RandomNormalLike
Cheers
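Edit: it turns out the RandomNormal node has dtype 10, which corresponds to fp16, whereas the onnxruntime implementation only supports float and double. See the source code here: https://github.com/microsoft/onnxruntime/blob/24e35fba3217bf33b0e4064bc71d271a61938ba0/onnxruntime/core/providers/cpu/generator/random.cc#L354
The solution here is either to run the whole model in fp32, or to explicitly ask RandomNormalLike to use float or double values, hoping, I guess, that torch allows mixing fp16 computation with fp32/fp64.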
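One way to try the second option on the PyTorch side is to sample the noise in float32 and cast it to the feature dtype afterwards. This is a minimal sketch, not a verified fix: the names sample_noise_fp32 and NoiseInjectionFp32Noise are invented here, and it assumes the exporter turns the float32 torch.randn call into a random node with a float dtype followed by a Cast, which is worth confirming on the exported graph.

```python
from typing import Optional

import torch
from torch import nn


def sample_noise_fp32(feat: torch.Tensor) -> torch.Tensor:
    """Draw noise in float32, then cast it to the dtype of feat."""
    batch, _, height, width = feat.shape
    noise = torch.randn(
        batch, 1, height, width,
        dtype=torch.float32,  # assumption: keeps the random op out of fp16
        device=feat.device,
    )
    return noise.to(feat.dtype)


class NoiseInjectionFp32Noise(nn.Module):
    """Hypothetical variant of NoiseInjection that never samples in fp16."""

    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.zeros(1), requires_grad=True)

    def forward(
        self,
        feat: torch.Tensor,
        noise: Optional[torch.Tensor] = None,
    ) -> torch.Tensor:
        if noise is None:
            noise = sample_noise_fp32(feat)
        return feat + self.weight * noise


# The sampled noise follows the dtype of the features it will be added to.
feat_half = torch.zeros(2, 3, 4, 4, dtype=torch.float16)
noise_half = sample_noise_fp32(feat_half)
print(noise_half.dtype, tuple(noise_half.shape))
```

The rest of the model can stay in fp16; only the sampling step and one cast run in float32.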
For anyone trying to reproduce this, I put together a minimal example. In the code below, RandLike works and RandReferenced does not:
import torch
from torch import nn
import onnxruntime as ort


class RandLike(nn.Module):
    def forward(self, x):
        return torch.randn_like(x[:, 0:1])


class RandReferenced(nn.Module):
    def forward(self, x):
        b, _, w, h = x.shape
        return torch.randn(
            b, 1, w, h,
            device=x.device,
            dtype=x.dtype,
        )


module = RandLike().cuda().half()
dummy_input = torch.randn(2, 3, 4, 4, device='cuda').half()
torch.onnx.export(module, dummy_input, "randlike_2.onnx", input_names=["rand_input"], output_names=["rand_output"])
module = RandReferenced().cuda().half()
torch.onnx.export(module, dummy_input, "randReferenced_2.onnx", input_names=["rand_input"], output_names=["rand_output"])

ort_session = ort.InferenceSession("randlike_2.onnx", providers=[
    "CUDAExecutionProvider",
])
ort_session.run(["rand_output"], {"rand_input": dummy_input.cpu().numpy()})

ort_session = ort.InferenceSession("randReferenced_2.onnx", providers=[
    "CUDAExecutionProvider",
])
ort_session.run(["rand_output"], {"rand_input": dummy_input.cpu().numpy()})
Running the code above produces the following error:
$ CUDA_VISIBLE_DEVICES=0 python random_like_onnx.py
Traceback (most recent call last):
  File "/home/***/***/random_like_onnx.py", line 32, in <module>
    ort_session = ort.InferenceSession("randReferenced_2.onnx", providers=[
  File "/home/***/miniconda3/envs/***/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 324, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/***/miniconda3/envs/***/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 369, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for RandomNormalLike(1) node with name 'RandomNormalLike_1'