Getting batch predictions for TFrecords via CloudML
I followed this great tutorial and successfully trained a model (on CloudML). My code also makes predictions offline, but now I am trying to use Cloud ML to make predictions and I am having some problems.
To deploy my model I followed this tutorial. Now I have code that generates TFRecords via apache_beam.io.WriteToTFRecord, and I want to make predictions for those TFRecords. To do so I am following this article, and my command looks like this:
gcloud ml-engine jobs submit prediction $JOB_ID --model $MODEL --input-paths gs://"$FILE_INPUT".gz --output-path gs://"$OUTPUT"/predictions --region us-west1 --data-format TF_RECORD_GZIP
But I only get the error:
'Exception during running the graph: Expected serialized to be a scalar, got shape: [64]
It seems like it expects the data in a different format. I found the format specification for JSON here, but could not find how to do it with TFRecords.
UPDATE: Here is the output of saved_model_cli show --all --dir:
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['prediction']:
The given SavedModel SignatureDef contains the following input(s):
inputs['example_proto'] tensor_info:
dtype: DT_STRING
shape: unknown_rank
name: input:0
The given SavedModel SignatureDef contains the following output(s):
outputs['probability'] tensor_info:
dtype: DT_FLOAT
shape: (1, 1)
name: probability:0
Method name is: tensorflow/serving/predict
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['example_proto'] tensor_info:
dtype: DT_STRING
shape: unknown_rank
name: input:0
The given SavedModel SignatureDef contains the following output(s):
outputs['probability'] tensor_info:
dtype: DT_FLOAT
shape: (1, 1)
name: probability:0
Method name is: tensorflow/serving/predict
When you export your model, you need to make sure it is "batchable", i.e., the outer dimension of the input placeholder has shape=[None], e.g.:
input = tf.placeholder(dtype=tf.string, shape=[None])
...
This may require reworking the graph a bit. For example, I see that the shape of your output is hard-coded to [1, 1]. The outermost dimension should be None; this may happen automatically once you fix the placeholder, or it may require other changes.
Given that the name of the output is probabilities, I would also expect the innermost dimension to be >1, i.e., the number of classes being predicted, so something like [None, NUM_CLASSES].
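The advice above can be sketched as follows. This is a minimal illustration, not the asker's actual graph: the single 2-element float feature "x", the dense layer, and NUM_CLASSES are all hypothetical stand-ins, and it uses the TF 1.x-style API (via tf.compat.v1) since the question predates TF 2. The key point is that the serialized-Example input is a vector placeholder with shape=[None], so the batch prediction service can feed it, say, 64 records at once (the [64] in the error) instead of a scalar:

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

NUM_CLASSES = 3  # hypothetical; use your real number of classes

# Batchable input: a VECTOR of serialized tf.Example protos, not a scalar.
serialized = tf1.placeholder(dtype=tf.string, shape=[None], name="input")

# Hypothetical feature spec -- replace with the one used during training.
features = tf1.parse_example(
    serialized,
    features={"x": tf1.FixedLenFeature([2], tf.float32)})

# Stand-in model: any graph whose batch dimension stays None works here.
logits = tf1.layers.dense(features["x"], NUM_CLASSES)

# Output keeps None as the outer dim and NUM_CLASSES as the inner dim,
# instead of the hard-coded (1, 1) in the SignatureDef above.
probability = tf.nn.softmax(logits, name="probability")

print(probability.shape.as_list())  # -> [None, 3]
```

Note that tf1.parse_example (as opposed to tf.io.parse_single_example) is what makes the graph accept a batch of serialized protos; pairing a batched parse with a scalar placeholder, or vice versa, produces exactly the "Expected serialized to be a scalar" style of mismatch.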