Feeding an example to tf predictor.from_saved_model() for an estimator trained with a TF Hub module

I am trying to export a text classification model built with TF Hub modules, and then infer a prediction from it for a single string example using predictor.from_saved_model(). I have seen some examples with a similar idea, but I still could not make it work for the case where the features are built with a TF Hub module. Here is what I do:

    import tensorflow as tf
    import tensorflow_hub as hub
    from tensorflow.contrib import predictor

    train_input_fn = tf.estimator.inputs.pandas_input_fn(
        train_df, train_df['label_ids'], num_epochs=None, shuffle=True)

    # Prediction on the whole training set.
    predict_train_input_fn = tf.estimator.inputs.pandas_input_fn(
        train_df, train_df['label_ids'], shuffle=False)

    embedded_text_feature_column = hub.text_embedding_column(
        key='sentence',
        module_spec='https://tfhub.dev/google/nnlm-de-dim128/1')

    # Estimator
    estimator = tf.estimator.DNNClassifier(
        hidden_units=[500, 100],
        feature_columns=[embedded_text_feature_column],
        n_classes=num_of_class,
        optimizer=tf.train.AdagradOptimizer(learning_rate=0.003))

    # Training
    estimator.train(input_fn=train_input_fn, steps=1000)

    # Prediction on the training set
    train_eval_result = estimator.evaluate(input_fn=predict_train_input_fn)

    print('Training set accuracy: {accuracy}'.format(**train_eval_result))

    # Export: the parsing serving input receiver expects serialized tf.Example protos
    feature_spec = tf.feature_column.make_parse_example_spec([embedded_text_feature_column])
    serving_input_receiver_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)

    export_dir_base = self.cfg['model_path']
    servable_model_path = estimator.export_savedmodel(export_dir_base, serving_input_receiver_fn)

    # Example message for inference
    message = "Was ist denn los"
    saved_model_predictor = predictor.from_saved_model(export_dir=servable_model_path)
    content_tf_list = tf.train.BytesList(value=[str.encode(message)])
    example = tf.train.Example(
            features=tf.train.Features(
                feature={
                    'sentence': tf.train.Feature(
                        bytes_list=content_tf_list
                    )
                }
            )
        )

    # Write the example to a TFRecord file, then read it back through a queue
    with tf.python_io.TFRecordWriter('the_message.tfrecords') as writer:
        writer.write(example.SerializeToString())

    reader = tf.TFRecordReader()
    data_path = 'the_message.tfrecords'
    filename_queue = tf.train.string_input_producer([data_path], num_epochs=1)
    _, serialized_example = reader.read(filename_queue)
    output_dict = saved_model_predictor({'inputs': [serialized_example]})

And the output:

Traceback (most recent call last):
  File "/Users/dimitrs/component-pythia/src/pythia.py", line 321, in _train
    model = algo.generate_model(samples, generation_id)
  File "/Users/dimitrs/component-pythia/src/algorithm_layer/algorithm.py", line 56, in generate_model
    model = self._process_training(samples, generation)
  File "/Users/dimitrs/component-pythia/src/algorithm_layer/tf_hub_classifier.py", line 91, in _process_training
    output_dict = saved_model_predictor({'inputs': [serialized_example]})
  File "/Users/dimitrs/anaconda3/envs/pythia/lib/python3.6/site-packages/tensorflow/contrib/predictor/predictor.py", line 77, in __call__
    return self._session.run(fetches=self.fetch_tensors, feed_dict=feed_dict)
  File "/Users/dimitrs/anaconda3/envs/pythia/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 900, in run
    run_metadata_ptr)
  File "/Users/dimitrs/anaconda3/envs/pythia/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "/Users/dimitrs/anaconda3/envs/pythia/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1316, in _do_run
    run_metadata)
  File "/Users/dimitrs/anaconda3/envs/pythia/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Unable to get element as bytes.

Isn't serialized_example the right kind of input that serving_input_receiver_fn suggests?
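
For reference, one way to check what the exported signature actually expects is to inspect the predictor's feed and fetch tensors; a minimal sketch (the exact tensor names depend on the export):

    # The parsing receiver exposes a single string placeholder, typically keyed
    # 'inputs', that expects serialized tf.Example protos
    print(saved_model_predictor.feed_tensors)
    print(saved_model_predictor.fetch_tensors)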

So, I just needed serialized_example = example.SerializeToString(). Writing the example to a file and reading it back would require starting a session first: TFRecordReader.read() returns a Tensor, not the actual bytes that the predictor's feed_dict needs. Simple serialization is enough:

    # Example message for inference
    message = "Was ist denn los"
    saved_model_predictor = predictor.from_saved_model(export_dir=servable_model_path)
    content_tf_list = tf.train.BytesList(value=[message.encode('utf-8')])
    sentence = tf.train.Feature(bytes_list=content_tf_list)
    sentence_dict = {'sentence': sentence}
    features = tf.train.Features(feature=sentence_dict)

    example = tf.train.Example(features=features)

    serialized_example = example.SerializeToString()
    output_dict = saved_model_predictor({'inputs': [serialized_example]})
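
For a DNNClassifier exported with the parsing receiver, the default serving signature should be a classification signature, so the returned dictionary is expected to contain 'classes' and 'scores' entries (a sketch under that assumption):

    # Predicted class labels and per-class probabilities for the single example;
    # with the default classification signature both arrays have shape (1, n_classes)
    print(output_dict['classes'])
    print(output_dict['scores'])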