How to debug "Cloud ML only supports TF 1.0 or above and models saved in SavedModel format."?

Question

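I'm using Cloud ML for batch prediction. Some of my models work and others don't. How can I debug the ones that don't? All I see in prediction.errors_stats-00000-of-00001 is a stream of errors: Cloud ML only supports TF 1.0 or above and models saved in SavedModel format. The output of saved_model_cli show --all --dir is below (the working models give the same output):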

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['prediction']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['example_proto'] tensor_info:
    dtype: DT_STRING
    shape: (-1)
    name: input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['id'] tensor_info:
    dtype: DT_STRING
    shape: (-1)
    name: id:0
    outputs['probability'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: probability:0
  Method name is: tensorflow/serving/predict

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['example_proto'] tensor_info:
    dtype: DT_STRING
    shape: (-1)
    name: input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['id'] tensor_info:
    dtype: DT_STRING
    shape: (-1)
    name: id:0
    outputs['label'] tensor_info:
    dtype: DT_INT64
    shape: (-1)
    name: label:0
    outputs['probability'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: probability:0
  Method name is: tensorflow/serving/predict

Update: My data is in TFRecord format, so I can't use gcloud ml-engine local predict.

Answer 1

(1) Did you specify --runtime-version when deploying the model? It defaults to 1.0, but you presumably need TensorFlow 1.8 or similar. If your model uses ops that were added after 1.0, you may get this message.

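(2) You can use gcloud ml-engine local predict even with TFRecords. Presumably the exported model has a single string tensor input of shape [None] that feeds directly into a ParseExample op. In that case, just follow the standard JSON API syntax and send a batch of strings containing serialized tf.Example records (base64-encode them and use the syntax that marks them as base64):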

  {"instances": [{"b64": base64.b64encode(example1)}, {"b64": base64.b64encode(example2)}, ...]}
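
The base64 wrapping described above can be sketched in Python as follows. This is only a minimal illustration: the helper names (build_instances, to_request_body, to_json_instances_lines) are made up for this example, and the byte strings stand in for serialized tf.Example records that you would normally read from your TFRecord files. It assumes the model has a single string input, as in the signature shown in the question. As far as I know, the file passed to --json-instances is newline-delimited with one instance per line, which is what the last helper produces.

```python
import base64
import json


def build_instances(serialized_examples):
    """Wrap each serialized tf.Example (bytes) in the {"b64": ...} form."""
    return [
        {"b64": base64.b64encode(record).decode("ascii")}
        for record in serialized_examples
    ]


def to_request_body(serialized_examples):
    """JSON body in the {"instances": [...]} shape used by the JSON API."""
    return json.dumps({"instances": build_instances(serialized_examples)})


def to_json_instances_lines(serialized_examples):
    """One JSON instance per line, for a --json-instances style file."""
    return "\n".join(
        json.dumps(instance) for instance in build_instances(serialized_examples)
    )


# Placeholder bytes (assumption): in practice these would be the raw records
# read from a TFRecord file, not hand-written values.
records = [b"\x0a\x00", b"example-2"]

body = to_request_body(records)
lines = to_json_instances_lines(records)
```

Writing lines to a file and passing that file to --json-instances should then let local predict run against the same records that batch prediction sees.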

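Another (better) option is to re-export the model using build_default_transforming_serving_input_receiver_fn instead of build_parsing_transforming_serving_input_receiver_fn. You don't have to retrain: you can always export from a checkpoint or SavedModel by writing a short script that loads the model and exports a new one. Then your JSON is simple: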

{"instances": [{"input_name": [10,3,5,6]}]}

If you only have one input, you can simplify further to:

{"instances": [[10,3,5,6]]}

Answer 2

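It turned out the problem was that my model was deployed on a quad-core CPU, where batch prediction did not work. Deploying the model on a single-core CPU solved it. This looks like a bug, and I have reported it.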

Answer 3

I ran into the same error. My command was:

gcloud ml-engine local predict --model-dir $MODEL_DIR --json-instances $JSON_INSTANCES --verbosity debug

The problem was that my $MODEL_DIR pointed to the wrong model directory. Make sure the model is in SavedModel format!
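
A quick sanity check along these lines can be sketched in Python. It relies only on the usual SavedModel layout (a saved_model.pb or saved_model.pbtxt file, normally next to a variables/ subdirectory). The function names are hypothetical and the check is a heuristic, not what Cloud ML itself runs. It also looks one level down, because Estimator exports typically land in a timestamped subdirectory of the export directory.

```python
import os

GRAPH_FILES = ("saved_model.pb", "saved_model.pbtxt")


def looks_like_saved_model(model_dir):
    """True if model_dir directly contains a SavedModel graph file."""
    return any(
        os.path.isfile(os.path.join(model_dir, name)) for name in GRAPH_FILES
    )


def describe_model_dir(model_dir):
    """Return a short, human-readable diagnosis of a local model directory."""
    if not os.path.isdir(model_dir):
        return "not a directory"
    if looks_like_saved_model(model_dir):
        has_variables = os.path.isdir(os.path.join(model_dir, "variables"))
        suffix = "" if has_variables else " (no variables/ subdirectory)"
        return "SavedModel found" + suffix
    # Possible mistake: pointing at the parent export directory instead of
    # the subdirectory that actually holds the SavedModel.
    nested = [
        entry
        for entry in sorted(os.listdir(model_dir))
        if looks_like_saved_model(os.path.join(model_dir, entry))
    ]
    if nested:
        return "no SavedModel here; try subdirectory " + nested[0]
    return "no SavedModel found"
```

Running describe_model_dir on the path you pass as --model-dir before calling gcloud may catch this mistake early. It only works for local paths, not gs:// URLs.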

python

tensorflow

google-cloud-ml