Getting batch predictions for TFrecords via CloudML

I followed this great tutorial and successfully trained a model (on CloudML). My code also makes predictions offline, but now I am trying to use Cloud ML for predictions and I am running into some problems.

To deploy my model I followed this tutorial. Now I have code that generates TFRecords via apache_beam.io.WriteToTFRecord, and I want to make predictions for those TFRecords. To do so I am following this article, and my command looks like this:

gcloud ml-engine jobs submit prediction $JOB_ID --model $MODEL --input-paths gs://"$FILE_INPUT".gz --output-path gs://"$OUTPUT"/predictions --region us-west1 --data-format TF_RECORD_GZIP
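For context, the input file referenced by --input-paths is a GZIP-compressed TFRecord file of serialized tf.Example protos. A minimal sketch of producing such a file locally (the feature name "x" and the values are illustrative assumptions, not taken from my pipeline) would be:

```python
import tensorflow as tf

# Hypothetical helper: serialize a tf.Example with one float feature.
# The feature name "x" is illustrative -- use whatever your model parses.
def make_example(values):
    return tf.train.Example(features=tf.train.Features(feature={
        "x": tf.train.Feature(float_list=tf.train.FloatList(value=values)),
    })).SerializeToString()

# Write a GZIP-compressed TFRecord file, which is the format
# --data-format TF_RECORD_GZIP expects as input.
path = "input.tfrecord.gz"
with tf.io.TFRecordWriter(path, options="GZIP") as writer:
    for record in [make_example([1.0, 2.0]), make_example([3.0, 4.0])]:
        writer.write(record)
```

In the real pipeline the equivalent output comes from apache_beam.io.WriteToTFRecord with a GZIP coder/compression setting.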

But all I get is the error: 'Exception during running the graph: Expected serialized to be a scalar, got shape: [64]'

It seems to expect the data in a different format. I found the format specification for JSON here, but could not find how to do it with TFRecords.
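For comparison, the JSON data format takes one JSON object per line, keyed by the signature's input alias; binary strings have to be wrapped in a "b64" object, and (per the Cloud ML docs' convention) the alias should end in _bytes for the service to base64-decode it. A hypothetical instance line would look like:

```json
{"example_proto_bytes": {"b64": "<base64-encoded serialized tf.Example>"}}
```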

Update: here is the output of saved_model_cli show --all --dir:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['prediction']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['example_proto'] tensor_info:
    dtype: DT_STRING
    shape: unknown_rank
    name: input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['probability'] tensor_info:
    dtype: DT_FLOAT
    shape: (1, 1)
    name: probability:0
  Method name is: tensorflow/serving/predict

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['example_proto'] tensor_info:
    dtype: DT_STRING
    shape: unknown_rank
    name: input:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['probability'] tensor_info:
    dtype: DT_FLOAT
    shape: (1, 1)
    name: probability:0
  Method name is: tensorflow/serving/predict

When you export your model, you need to make sure it is "batchable", i.e., the outer dimension of the input placeholder has shape=[None], e.g.:

input = tf.placeholder(dtype=tf.string, shape=[None])
...

This may entail reworking the graph slightly. For example, I see that your output's shape is hard-coded to [1, 1]. The outermost dimension should be None; that may happen automatically once you fix the placeholder, or it may require other changes.

Given that the name of the output is probabilities, I would also expect the innermost dimension to be >1, i.e. the number of classes being predicted, so something like [None, NUM_CLASSES].
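Putting both fixes together, here is a minimal sketch of a batchable graph (written against tf.compat.v1 for the TF 1.x-style export; the feature spec and the tiny sigmoid "model" are placeholder assumptions, the point is the [None] outer dimension on input and output):

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # A batch of serialized tf.Example protos, not a single scalar.
    serialized = tf.compat.v1.placeholder(tf.string, shape=[None], name="input")
    # Hypothetical feature spec -- replace with your model's actual features.
    features = tf.io.parse_example(
        serialized, {"x": tf.io.FixedLenFeature([2], tf.float32)})
    # Stand-in "model": the batch dimension stays None all the way through.
    logits = tf.reduce_sum(features["x"], axis=1, keepdims=True)
    probability = tf.sigmoid(logits, name="probability")  # shape [None, 1]

with tf.compat.v1.Session(graph=graph) as sess:
    examples = [
        tf.train.Example(features=tf.train.Features(feature={
            "x": tf.train.Feature(float_list=tf.train.FloatList(value=[1.0, 2.0])),
        })).SerializeToString()
        for _ in range(64)
    ]
    # Feeding 64 records at once now works instead of raising
    # "Expected serialized to be a scalar, got shape: [64]".
    probs = sess.run(probability, feed_dict={serialized: examples})
    print(probs.shape)  # one probability row per record in the batch
```

With the outer dimension left as None, the batch prediction service can feed any number of records per call.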