TensorFlow: how to export estimator using TensorHub module?

I have an estimator that uses a TensorHub text_embedding column, like so:

my_dataframe = pandas.DataFrame(columns=["title"])
# populate data
labels = [] 
# populate labels with 0|1
embedded_text_feature_column = hub.text_embedding_column(
    key="title" 
    ,module_spec="https://tfhub.dev/google/nnlm-en-dim128-with-normalization/1")


estimator = tf.estimator.LinearClassifier(
    feature_columns = [ embedded_text_feature_column ]
    ,optimizer=tf.train.FtrlOptimizer(
        learning_rate=0.1
        ,l1_regularization_strength=1.0
    )
    ,model_dir=model_dir
)

estimator.train(
    input_fn=tf.estimator.inputs.pandas_input_fn(
        x=my_dataframe
        ,y=labels
        ,batch_size=128
        ,num_epochs=None
        ,shuffle=True
        ,num_threads=5
    )
    ,steps=5000
)
export(estimator, "/tmp/my_model")

How can I export and serve the model so that it accepts a string as the input for prediction? I have a serving_input_receiver_fn as below, and have tried many variations, but I'm confused about what it needs to look like so that I can serve the model (e.g. with saved_model_cli) and call it with a title string (or a simple JSON structure) as input.

def export(estimator, dir_path):
    def serving_input_receiver_fn():
        # make_parse_example_spec expects a list of feature columns
        feature_spec = tf.feature_column.make_parse_example_spec([hub.text_embedding_column(
            key="title"
            ,module_spec="https://tfhub.dev/google/nnlm-en-dim128-with-normalization/1")])
        return tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)


    estimator.export_savedmodel(
        export_dir_base=dir_path
        ,serving_input_receiver_fn=serving_input_receiver_fn()
    )

If you want to serve raw strings, you may want to consider using a raw input receiver. This code:

feature_placeholder = {'title': tf.placeholder('string', [None], name='title_placeholder')}
serving_input_fn = tf.estimator.export.build_raw_serving_input_receiver_fn(feature_placeholder)

estimator.export_savedmodel(dir_path, serving_input_fn)

will give you a SavedModel with the following input specification, according to the SavedModel CLI:

saved_model_cli show --dir ./ --tag_set serve --signature_def serving_default

The given SavedModel SignatureDef contains the following input(s):
  inputs['inputs'] tensor_info:
    dtype: DT_STRING
    shape: (-1)
    name: title_placeholder_1:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['classes'] tensor_info:
    dtype: DT_STRING
    shape: (-1, 2)
    name: linear/head/Tile:0
  outputs['scores'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 2)
    name: linear/head/predictions/probabilities:0

You can provide a Python expression to the CLI to feed the model input and validate that it works:

saved_model_cli run --dir ./ --tag_set serve --signature_def \
serving_default --input_exprs "inputs=['this is a test sentence']"

Result for output key classes:
[[b'0' b'1']]
Result for output key scores:
[[0.5123377 0.4876624]]
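Once the SavedModel is being served over HTTP (for example with TensorFlow Serving's `tensorflow_model_server`, which exposes a REST predict endpoint), the "simple JSON structure" from the question is just a body with an `instances` list, where each instance is a raw title string. A minimal sketch of building that request body; the model name `my_model` and the port are hypothetical and depend on how you launch the server:

```python
import json

def make_predict_request(titles):
    # TensorFlow Serving's REST predict API expects a JSON object with an
    # "instances" list; for a single-string-input signature, each instance
    # can be the raw string itself.
    return json.dumps({"instances": list(titles)})

body = make_predict_request(["this is a test sentence"])
print(body)
# The body would then be POSTed to the (hypothetical) endpoint:
#   http://localhost:8501/v1/models/my_model:predict
```

The response mirrors the signature shown by saved_model_cli above, i.e. a `predictions` list containing the `classes` and `scores` for each instance.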