TensorFlow Lite does not recognize op VarHandleOp

Question

I am trying to convert a TF model to TFLite. The model was saved in .pb format, and I converted it with the following code:

import os
import tensorflow as tf
from tensorflow.core.protobuf import meta_graph_pb2

export_dir = os.path.join('export_dir', '0')
if not os.path.exists('export_dir'):
    os.mkdir('export_dir')

tf.compat.v1.enable_control_flow_v2()
tf.compat.v1.enable_v2_tensorshape()

# I took this function from a tutorial on the TF website
def wrap_frozen_graph(graph_def, inputs, outputs):
    def _imports_graph_def():
        tf.compat.v1.import_graph_def(graph_def, name="")
    wrapped_import = tf.compat.v1.wrap_function(_imports_graph_def, [])
    import_graph = wrapped_import.graph
    return wrapped_import.prune(
            inputs, outputs)

graph_def = tf.compat.v1.GraphDef()
loaded = graph_def.ParseFromString(open(os.path.join(export_dir, 'saved_model.pb'),'rb').read())

concrete_func = wrap_frozen_graph(
        graph_def, inputs=['extern_data/placeholders/data/data:0', 'extern_data/placeholders/data/data_dim0_size:0'],
    outputs=['output/output_batch_major:0'])
concrete_func.inputs[0].set_shape([None, 50])
concrete_func.inputs[1].set_shape([None])
concrete_func.outputs[0].set_shape([None, 100])

converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
converter.experimental_new_converter = True
converter.post_training_quantize=True
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
                                               tf.lite.OpsSet.SELECT_TF_OPS]
converter.allow_custom_ops=True

tflite_model = converter.convert()

# Save the model.
if not os.path.exists('tflite'):
    os.mkdir('tflite')
output_model = os.path.join('tflite', 'model.tflite')
with open(output_model, 'wb') as f:
     f.write(tflite_model)

However, when I try to use the interpreter with this model, I get the following error:

INFO: TfLiteFlexDelegate delegate: 8 nodes delegated out of 970 nodes with 3 partitions.

INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 4 nodes with 0 partitions.

INFO: TfLiteFlexDelegate delegate: 3 nodes delegated out of 946 nodes with 1 partitions.

INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 1 nodes with 0 partitions.

INFO: TfLiteFlexDelegate delegate: 3 nodes delegated out of 16 nodes with 2 partitions.

Traceback (most recent call last):
  File "/path/to/tflite_interpreter.py", line 9, in <module>
    interpreter.allocate_tensors()
  File "/path/to/lib/python3.6/site-packages/tensorflow/lite/python/interpreter.py", line 243, in allocate_tensors
    return self._interpreter.AllocateTensors()
RuntimeError: Encountered unresolved custom op: VarHandleOp.Node number 0 (VarHandleOp) failed to prepare.

I cannot find any VarHandleOp in my code, and I found out that it is actually part of TensorFlow (https://www.tensorflow.org/api_docs/python/tf/raw_ops/VarHandleOp). So why doesn't TFLite recognize it?

Answer 1

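Providing a minimal reproducible example, as the SO guidelines recommend, is certainly hard in the case of model conversion, but the question would benefit from better pointers. For example, instead of saying "I took this function from a tutorial on the TF website", it would be better to link to the tutorial. The TF website is vast.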

The tutorial you refer to is probably from the section on migrating from TF1 to TF2, specifically the part that deals with raw graph files. The crucial note there is

if you have a "Frozen graph" (a tf.Graph where the variables have been turned into constants)

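(the emphasis is mine). Evidently your graph contains VarHandleOp (the same applies to Variable and VariableV2 nodes), so by this definition it is not "frozen". Your general approach makes sense, but you need a graph that contains the actual trained values of the variables in the form of Const nodes. You need variables at training time, but at inference time their values are fixed and should be baked into the graph. TFLite, as an inference-time framework, does not support variables.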
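A quick way to check whether a graph is frozen in this sense is to scan its nodes for the variable op types named above. The following is a minimal sketch, not part of the original answer: the helper name find_variable_nodes and the stand-in graph are hypothetical, and the helper only relies on the object exposing a .node list whose entries have .name and .op, as a parsed tf.compat.v1.GraphDef does.

```python
from types import SimpleNamespace

# Op types that indicate a variable that has not been frozen into a constant.
VARIABLE_OPS = {"VarHandleOp", "Variable", "VariableV2"}


def find_variable_nodes(graph_def):
    """Return (name, op) pairs for top-level nodes that are still variables."""
    return [(node.name, node.op) for node in graph_def.node
            if node.op in VARIABLE_OPS]


# Hypothetical stand-in for a parsed GraphDef, so the sketch runs without
# TensorFlow. In practice, pass the GraphDef you loaded from disk.
fake_graph_def = SimpleNamespace(node=[
    SimpleNamespace(name="dense/kernel", op="VarHandleOp"),
    SimpleNamespace(name="dense/MatMul", op="MatMul"),
    SimpleNamespace(name="dense/bias", op="Const"),
])

leftover = find_variable_nodes(fake_graph_def)
print(leftover)  # a non-empty list means the graph is not frozen
```

Note that this only inspects top-level nodes. A graph that uses control flow may also keep nodes inside its function library, which this sketch does not visit.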

The rest of your idea seems fine. TFLiteConverter.from_concrete_functions currently takes exactly one concrete_function, but that is what you get from wrapping the graph. With enough luck, it may work.

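There is a utility, tensorflow/python/tools/freeze_graph.py, that does its best to replace the variables in a Graph.pb with constants taken from the latest checkpoint file. If you look at its code, you will see that either using the saved metagraph (checkpoint_name.meta) file or pointing the tool at the training directory removes a lot of guesswork. Also, I think that providing the model directory is the only way to get a single frozen graph out of a sharded model.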
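To illustrate what freezing does to a graph, here is a minimal sketch that builds a toy TF1-style graph with one resource variable and freezes it in-process. It does not invoke freeze_graph.py itself: it calls tf.compat.v1.graph_util.convert_variables_to_constants (deprecated but still available in TF 2.x), which, to my understanding, is what that script uses internally. The graph, the node names "x", "w" and "y", and the use of an initializer instead of a checkpoint restore are all assumptions made for the example.

```python
import tensorflow as tf

# Toy graph (hypothetical): y = x @ w, where w is a resource variable.
graph = tf.Graph()
with graph.as_default():
    x = tf.compat.v1.placeholder(tf.float32, shape=[None, 2], name="x")
    w = tf.compat.v1.get_variable(
        "w", shape=[2, 3],
        initializer=tf.compat.v1.ones_initializer(),
        use_resource=True)  # resource variables appear as VarHandleOp nodes
    y = tf.identity(tf.matmul(x, w), name="y")
    init = tf.compat.v1.global_variables_initializer()

with tf.compat.v1.Session(graph=graph) as sess:
    # For a real model you would restore trained values from a checkpoint
    # (e.g. with tf.compat.v1.train.Saver) instead of running initializers.
    sess.run(init)
    frozen_graph_def = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, graph.as_graph_def(), output_node_names=["y"])

ops_before = {node.op for node in graph.as_graph_def().node}
ops_after = {node.op for node in frozen_graph_def.node}
print("VarHandleOp before freezing:", "VarHandleOp" in ops_before)
print("VarHandleOp after freezing:", "VarHandleOp" in ops_after)
```

The resulting frozen_graph_def is the kind of GraphDef that wrap_frozen_graph from the question expects, since its variables have been replaced by Const nodes.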

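I noticed that in your example you pass just inputs in place of tf.nest.map_structure(import_graph.as_graph_element, inputs). You may have other reasons, but if you did this because as_graph_element complained about datatype/shape, freezing the graph properly will likely resolve it. The concrete_function you obtain from a frozen graph will have a good idea of its input shapes and datatypes. In general, having to set them manually is unexpected, and the fact that you do so seems odd to me (though I do not claim broad experience with this dark corner of TF).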

map_structure has a keyword argument for skipping the check (presumably check_types).
