FailedPreconditionError: GetNext() failed after loading a Tensorflow Saved_Model

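I built a dedicated class to build, train, save, and then load my models. Saving is done with tf.saved_model.simple_save, and restoring with tf.saved_model.loader.load.

Training and inference are done with the Dataset API. Everything works fine when using the freshly trained model.

However, if I restore a saved model, inference breaks and throws this error: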

FailedPreconditionError (see above for traceback): GetNext() failed because the iterator has not been initialized. Ensure that you have run the initializer operation for this iterator before getting the next element.

[[Node: datasets/cond/IteratorGetNext_1 = IteratorGetNextoutput_shapes=[[?,?,30], [?,5]], output_types=[DT_INT32, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

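I am sure the iterator is initialized (the print shows up as expected; see the code below). Could it be related to which graph the variables belong to? Any other ideas? I am somewhat stuck here.

(Simplified) code: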

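import numpy as np
import tensorflow as tf  # TensorFlow 1.x API
from tensorflow.python.saved_model import tag_constants

class Model():
    def __init__(self):
        # Each model owns its own graph and session
        self.graph = tf.Graph()
        self.sess = tf.Session(graph=self.graph)
        with self.graph.as_default():
            self.features_data_ph = tf.placeholder(...)
            self.labels_data_ph = tf.placeholder(...)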

    def build(self):
        with self.graph.as_default():
            self.logits = my_model(self.input_tensor)
            self.loss = my_loss(self.logits, self.labels_tensor)

    def train(self):
        my_training_procedure()

    def set_datasets(self):
        with self.graph.as_default():
            with tf.variable_scope('datasets'):
                self.dataset = tf.data.Dataset.from_tensor_slices((self.features_data_ph, self.labels_data_ph))
                self.iter = self.dataset.make_initializable_iterator()
                self.input_tensor, self.labels_tensor = self.iter.get_next()

    def initialize_iterators(self, inference_data):
        with self.graph.as_default():
            feats = inference_data
            labs = np.zeros((len(feats), self.hp.num_classes))
            self.sess.run(self.iter.initializer,
                feed_dict={self.features_data_ph: feats,
                    self.labels_data_ph: labs})
            print('Iterator ready to infer')

    def infer(self, inference_data):
        self.initialize_iterators(inference_data)
        return self.sess.run(self.logits)

    def save(self, path):
        inputs = {"features_data_ph": self.features_data_ph,
            "labels_data_ph": self.labels_data_ph}
        outputs = {"logits": self.logits}
        # simple_save needs the input and output tensors to build the signature
        tf.saved_model.simple_save(self.sess, path, inputs, outputs)

    @staticmethod
    def restore(path):
        model = Model()
        tf.saved_model.loader.load(model.sess, [tag_constants.SERVING], path)
        model.features_data_ph = model.graph.get_tensor_by_name("features_data_ph:0")
        model.labels_data_ph = model.graph.get_tensor_by_name("labels_data_ph:0")
        model.logits = model.graph.get_tensor_by_name("model/classifier/dense/BiasAdd:0")
        model.set_datasets()
        return model

Failing routine:

model1 = Model()
model1.build()
model1.train()
model1.save(model1_path)

...

model2 = Model.restore(model1_path)
model2.infer(some_numpy_array) # Error here, after print, at sess.run()

(Restoring the model itself works: tensor values match between the original and the restored model.)

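I ran into the same problem. I think the issue is that you are initializing a new Dataset object instead of initializing the iterator that was saved with the model.

Try: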

# Look up the initializer of the iterator that was saved with the graph
make_iter = model.graph.get_operation_by_name("YOURPREFIX/MakeIterator")
model.sess.run(make_iter, feed_dict)
model.infer(some_numpy_array)
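
If you are unsure what YOURPREFIX is, one option is to scan the graph for ops of type MakeIterator. The following is only a minimal sketch, not the asker's actual model: it assumes TensorFlow with the tensorflow.compat.v1 module available, and the helper find_iterator_initializers and the toy graph are hypothetical names introduced for illustration.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()


def find_iterator_initializers(graph):
    """Return the names of all iterator initializer (MakeIterator) ops in `graph`."""
    return [op.name for op in graph.get_operations() if op.type == "MakeIterator"]


# Toy graph standing in for the restored one (hypothetical, for illustration only).
graph = tf.Graph()
with graph.as_default():
    feats_ph = tf.placeholder(tf.int32, [None, 3], name="features_data_ph")
    with tf.variable_scope("datasets"):
        dataset = tf.data.Dataset.from_tensor_slices(feats_ph).batch(2)
        iterator = tf.data.make_initializable_iterator(dataset)
        next_batch = iterator.get_next()

init_names = find_iterator_initializers(graph)
print(init_names)

with tf.Session(graph=graph) as sess:
    # Fetch the initializer by name, exactly as one would on a restored graph.
    make_iter = graph.get_operation_by_name(init_names[0])
    data = np.arange(12, dtype=np.int32).reshape(4, 3)
    sess.run(make_iter, feed_dict={feats_ph: data})
    first_batch = sess.run(next_batch)
```

The point of the lookup is that get_next() in the restored graph is wired to the saved iterator resource, so it is that iterator's initializer that has to run, not the initializer of a freshly created one.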

I solved this by changing how the Dataset iterator is created, giving its initializer an explicit name:

iterator = tf.data.Iterator.from_structure(dataset.output_types, dataset.output_shapes)
dataset_init_op = iterator.make_initializer(dataset, name='dataset_init')
...
# restoring
dataset_init_op = restored_graph.get_operation_by_name('dataset_init')
sess.run(
    dataset_init_op,
    feed_dict={...}
)

There is a working piece of code here -> https://vict0rsch.github.io/2018/05/17/restore-tf-model-dataset/
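
For reference, the named-initializer approach can be sketched end to end as a save/restore round trip. This is an illustrative sketch under assumptions, not the code from the linked post: it uses tensorflow.compat.v1, and the toy model (a single weight matrix "w"), the tensor name "logits", and the temporary export directory are all hypothetical.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# simple_save requires that the export directory does not exist yet.
export_dir = os.path.join(tempfile.mkdtemp(), "saved_model")

# --- Build and save: give the iterator initializer a stable name ---
build_graph = tf.Graph()
with build_graph.as_default():
    feats_ph = tf.placeholder(tf.float32, [None, 3], name="features_data_ph")
    dataset = tf.data.Dataset.from_tensor_slices(feats_ph).batch(2)
    iterator = tf.data.Iterator.from_structure(
        tf.data.get_output_types(dataset), tf.data.get_output_shapes(dataset))
    iterator.make_initializer(dataset, name="dataset_init")
    batch = iterator.get_next()
    # Hypothetical one-layer "model" so there is a variable to save.
    weights = tf.get_variable("w", [3, 1], initializer=tf.ones_initializer())
    logits = tf.matmul(batch, weights, name="logits")
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        tf.saved_model.simple_save(
            sess, export_dir,
            inputs={"features": feats_ph}, outputs={"logits": logits})

# --- Restore: fetch the saved initializer by name instead of rebuilding the dataset ---
restored_graph = tf.Graph()
with tf.Session(graph=restored_graph) as sess:
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], export_dir)
    dataset_init_op = restored_graph.get_operation_by_name("dataset_init")
    restored_ph = restored_graph.get_tensor_by_name("features_data_ph:0")
    restored_logits_t = restored_graph.get_tensor_by_name("logits:0")
    sess.run(dataset_init_op,
             feed_dict={restored_ph: np.ones((4, 3), dtype=np.float32)})
    restored_logits = sess.run(restored_logits_t)
```

Note that nothing dataset-related is rebuilt after loading: the placeholder, the initializer, and the output tensor are all fetched from the restored graph by name.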

A simple approach: before the training loop, add this line of code:

tf.add_to_collection("saved_model_main_op",tf.group([train_iter], name='legacy_init_op'))

The collection name "saved_model_main_op" is fixed; it has to be exactly this string.

train_iter is the op that initializes the iterator.