Use .h5 model with NVIDIA TensorRT on CPU instead of GPU
I have an .h5 model (built for GPU?) that I want to run on my CPU. I converted the model with Python, and it does appear to get converted, but when I use it in the TensorRT Docker container I get this error:
[[TRTEngineOp_8]]
E0106 21:02:54.141211 1 model_repository_manager.cc:810] failed to load 'retinanet_TRT' version 1: Internal: No OpKernel was registered to support Op 'TRTEngineOp' used by {{node TRTEngineOp_16}}with these attrs: [use_calibration=false, fixed_input_size=true, input_shapes=[[?,?,?,3]], OutT=[DT_FLOAT], precision_mode="FP16", static_engine=false, serialized_segment="\ne\n1T...2[=13=]5VALID", cached_engine_batches=[], InT=[DT_FLOAT], calibration_data="", output_shapes=[[?,?,?,64]], workspace_size_bytes=2127659, max_cached_engines_count=1, segment_funcdef_name="TRTEngineOp_16_native_segment"]
Registered devices: [CPU, XLA_CPU]
Registered kernels:
device='GPU'
What do I have to do to convert the model so that it can be used with the CPU only?
It was converted like this:
with tf.Graph().as_default():
    with tf.Session() as sess:
        graph = sess.graph
        K.set_session(sess)
        K.set_learning_phase(0)
        inference_model = create_model(num_classes=num_classes)
        load_model()
        # Find output nodes
        outputs, output_node_list = get_nodes_from_model(inference_model.outputs)
        # Find input nodes
        inputs, input_node_list = get_nodes_from_model(inference_model.inputs)
        generate_config()
        with sess.as_default():
            freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(None or []))
            output_names = output_node_list or []
            output_names += [v.op.name for v in tf.global_variables()]
            input_graph_def = graph.as_graph_def()
            for node in input_graph_def.node:
                # print(node.name)
                node.device = ""
            frozen_graph = tf.compat.v1.graph_util.convert_variables_to_constants(
                sess, input_graph_def, output_names, freeze_var_names)
            trt_graph = trt.create_inference_graph(
                # frozen model
                input_graph_def=frozen_graph,
                outputs=output_node_list,
                # specify the max workspace
                max_workspace_size_bytes=500000000,
                # precision, can be "FP32" (32 floating point precision) or "FP16"
                precision_mode=precision,
                is_dynamic_op=True)
            # Finally we serialize and dump the output graph to the filesystem
            with tf.gfile.GFile(model_save_path, 'wb') as f:
                f.write(trt_graph.SerializeToString())
            print("TensorRT model is successfully stored! \n")
Setting is_dynamic_op=True already helped with the conversion (it now says the model is successfully stored), but I still cannot load it in the TensorRT server Docker container.
I am using the nvcr.io/nvidia/tensorflow:19.10-py3 container to convert the model and the nvcr.io/nvidia/tensorrtserver:19.10-py3 container for the TensorRT server.
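One way to confirm what the error message says (TRTEngineOp only has a GPU kernel registered) is to parse the saved graph and count those nodes. This is only a minimal sketch, assuming TF 1.x and the same model_save_path used above:

import tensorflow as tf

# Parse the converted graph written above (model_save_path is assumed to
# point at that file) and count the TF-TRT engine nodes it contains.
graph_def = tf.GraphDef()
with tf.gfile.GFile(model_save_path, 'rb') as f:
    graph_def.ParseFromString(f.read())

trt_nodes = [n.name for n in graph_def.node if n.op == 'TRTEngineOp']
print("TRTEngineOp nodes in the converted graph:", len(trt_nodes))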
Just don't convert your model to TensorRT. As the error shows, TRTEngineOp only has a GPU kernel registered, so a TF-TRT converted graph cannot run on a CPU-only server. Freeze the graph without the create_inference_graph step instead:
with tf.Graph().as_default():
    with tf.Session() as sess:
        graph = sess.graph
        K.set_session(sess)
        K.set_learning_phase(0)
        inference_model = create_model(num_classes=num_classes)
        load_model()
        # Find output nodes
        outputs, output_node_list = get_nodes_from_model(inference_model.outputs)
        # Find input nodes
        inputs, input_node_list = get_nodes_from_model(inference_model.inputs)
        generate_config()
        with sess.as_default():
            freeze_var_names = list(set(v.op.name for v in tf.global_variables()).difference(None or []))
            output_names = output_node_list or []
            output_names += [v.op.name for v in tf.global_variables()]
            input_graph_def = graph.as_graph_def()
            for node in input_graph_def.node:
                # print(node.name)
                node.device = ""
            frozen_graph = tf.compat.v1.graph_util.convert_variables_to_constants(
                sess, input_graph_def, output_names, freeze_var_names)
            # Finally we serialize and dump the output graph to the filesystem
            with tf.gfile.GFile(model_save_path, 'wb') as f:
                f.write(frozen_graph.SerializeToString())
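To check that this plain frozen graph really is CPU-friendly, it can be re-imported in a CPU-only session. Again a minimal sketch, assuming TF 1.x and the model_save_path written above:

import tensorflow as tf

# Re-load the frozen (non-TRT) graph written above; model_save_path is the
# file produced by the freezing code.
graph_def = tf.GraphDef()
with tf.gfile.GFile(model_save_path, 'rb') as f:
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name='')

# With no TRTEngineOp nodes in the graph, importing and running it in a
# CPU-only session should work (device_count={'GPU': 0} hides any GPUs).
with tf.Session(graph=graph, config=tf.ConfigProto(device_count={'GPU': 0})) as sess:
    print("Frozen graph imported with", len(graph.get_operations()), "ops")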