保存张量流模型:找不到变量

Save tensorflow model: variables not found

我正在尝试使用 tf.train.Saver 保存和恢复张量流模型。我想我已经正确地完成了保存部分,但是当我尝试恢复模型时我得到了这个错误:

2017-07-03 15:55:14.824767: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key train/beta2_power not found in checkpoint
2017-07-03 15:55:14.824796: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases-outputlayer not found in checkpoint
2017-07-03 15:55:14.825913: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases-outputlayer/Adam not found in checkpoint
2017-07-03 15:55:14.826588: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases-outputlayer/Adam_1 not found in checkpoint
2017-07-03 15:55:14.827369: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key train/beta1_power not found in checkpoint
2017-07-03 15:55:14.828101: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases1 not found in checkpoint
2017-07-03 15:55:14.828973: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases1/Adam not found in checkpoint
2017-07-03 15:55:14.829151: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases1/Adam_1 not found in checkpoint
2017-07-03 15:55:14.830308: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights5/Adam_1 not found in checkpoint
2017-07-03 15:55:14.830590: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases2 not found in checkpoint
2017-07-03 15:55:14.831279: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases2/Adam not found in checkpoint
2017-07-03 15:55:14.832268: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases2/Adam_1 not found in checkpoint
2017-07-03 15:55:14.832558: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights5/Adam not found in checkpoint
2017-07-03 15:55:14.833052: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases3 not found in checkpoint
2017-07-03 15:55:14.834195: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases3/Adam not found in checkpoint
2017-07-03 15:55:14.834228: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases3/Adam_1 not found in checkpoint
2017-07-03 15:55:14.834629: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases4 not found in checkpoint
2017-07-03 15:55:14.835986: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights5 not found in checkpoint
2017-07-03 15:55:14.836128: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases5 not found in checkpoint
2017-07-03 15:55:14.836423: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases4/Adam not found in checkpoint
2017-07-03 15:55:14.837906: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases4/Adam_1 not found in checkpoint
2017-07-03 15:55:14.838055: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights4/Adam_1 not found in checkpoint
2017-07-03 15:55:14.838388: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases5/Adam not found in checkpoint
2017-07-03 15:55:14.839666: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Biases5/Adam_1 not found in checkpoint
2017-07-03 15:55:14.840299: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights-outputlayer not found in checkpoint
2017-07-03 15:55:14.840774: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights4/Adam not found in checkpoint
2017-07-03 15:55:14.841568: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights-outputlayer/Adam not found in checkpoint
2017-07-03 15:55:14.842312: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights-outputlayer/Adam_1 not found in checkpoint
2017-07-03 15:55:14.842689: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights1 not found in checkpoint
2017-07-03 15:55:14.843789: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights4 not found in checkpoint
2017-07-03 15:55:14.844030: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights1/Adam not found in checkpoint
2017-07-03 15:55:14.844775: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights1/Adam_1 not found in checkpoint
2017-07-03 15:55:14.845580: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights2 not found in checkpoint
2017-07-03 15:55:14.845919: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights3/Adam_1 not found in checkpoint
2017-07-03 15:55:14.846800: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights2/Adam not found in checkpoint
2017-07-03 15:55:14.847101: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights2/Adam_1 not found in checkpoint
2017-07-03 15:55:14.847274: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights3 not found in checkpoint
2017-07-03 15:55:14.847467: W tensorflow/core/framework/op_kernel.cc:1152] Not found: Key Weights3/Adam not found in checkpoint
Traceback (most recent call last):
  File "predict2.py", line 6, in <module>
    saver.restore(sess,tf.train.latest_checkpoint('./'))
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1457, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key Biases-outputlayer not found in checkpoint
     [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Caused by op u'save/RestoreV2', defined at:
  File "predict2.py", line 5, in <module>
    saver = tf.train.import_meta_graph(os.getcwd()+'/models/baseDNN.meta')
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1595, in import_meta_graph
    **kwargs)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/meta_graph.py", line 499, in import_scoped_meta_graph
    producer_op_list=producer_op_list)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 308, in import_graph_def
    op_def=op_def)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2336, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

NotFoundError (see above for traceback): Key Biases-outputlayer not found in checkpoint
     [[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

我查看了其他 Whosebug 问题,他们使用 import_meta_graphget_graph_by_tensorrestore 加载回张量流模型,我尝试将其调整为我的代码,但我一直收到这些错误,它说它在检查点中找不到任何变量。

这是保存模型的代码:

TRAIN_KEEP_PROB = 1.0
TEST_KEEP_PROB = 1.0
learning_rate = 0.0001
ne = 10

train = 100
test = 1
num_nodes = 250
len_puzzle = 80
n_nodes_hl1 = num_nodes # hidden layer 1
n_nodes_hl2 = num_nodes
n_nodes_hl3 = num_nodes
n_nodes_hl4 = num_nodes
n_nodes_hl5 = num_nodes

n_classes = 4
batch_size = 100 # load 100 features at a time

x = tf.placeholder('float',[None,TF_SHAPE],name="x_placeholder")
y = tf.placeholder('float',name='y_placeholder')
keep_prob = tf.placeholder('float',name='keep_prob_placeholder')

def neuralNet(data):
    hl_1 = {'weights':tf.Variable(tf.random_normal([TF_SHAPE, n_nodes_hl1]),name='Weights1'),
            'biases':tf.Variable(tf.random_normal([n_nodes_hl1]),name='Biases1')}

    hl_2 = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2]),name='Weights2'),
            'biases':tf.Variable(tf.random_normal([n_nodes_hl2]),name='Biases2')}

    hl_3 = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3]),name='Weights3'),
            'biases':tf.Variable(tf.random_normal([n_nodes_hl3]),name='Biases3')}

    hl_4 = {'weights':tf.Variable(tf.random_normal([n_nodes_hl3, n_nodes_hl4]),name='Weights4'),
            'biases':tf.Variable(tf.random_normal([n_nodes_hl4]),name='Biases4')}

    hl_5 = {'weights':tf.Variable(tf.random_normal([n_nodes_hl4, n_nodes_hl5]),name='Weights5'),
            'biases':tf.Variable(tf.random_normal([n_nodes_hl5]),name='Biases5')}

    output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl5, n_classes]),name='Weights-outputlayer'),
            'biases':tf.Variable(tf.random_normal([n_classes]),name='Biases-outputlayer')}

    l1 = tf.add(tf.matmul(data, hl_1['weights']), hl_1['biases'])
    l1 = tf.nn.sigmoid(l1,name='op1')

    l2 = tf.add(tf.matmul(l1, hl_2['weights']), hl_2['biases'])
    l2 = tf.nn.sigmoid(l2,name='op2')

    l3 = tf.add(tf.matmul(l2, hl_3['weights']), hl_3['biases'])
    l3 = tf.nn.sigmoid(l3,name='op3')

    l4 = tf.add(tf.matmul(l3, hl_4['weights']), hl_4['biases'])
    l4 = tf.nn.sigmoid(l4,name='op4')

    l5 = tf.add(tf.matmul(l4, hl_5['weights']), hl_5['biases'])
    l5 = tf.nn.sigmoid(l5,name='op5')

    dropout = tf.nn.dropout(l5,keep_prob, name='op6')
    ol = tf.add(tf.matmul(dropout, output_layer['weights']), output_layer['biases'], name='op7')

    return ol


def train(x):
    prediction = neuralNet(x)
    #print prediction
    with tf.name_scope('cross_entropy'):
        cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction,labels=y))
        tf.summary.scalar('cross_entropy',cost)

    with tf.name_scope('train'):
        optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost) # learning rate = 0.001

    with tf.name_scope('accuracy'):
        correct = tf.equal(tf.argmax(prediction,1),tf.argmax(y,1))
        accuracy = tf.reduce_mean(tf.cast(correct,'float'))
        tf.summary.scalar('accuracy',accuracy)

    # cycles of feed forward and backprop
    num_epochs = ne


    with tf.Session() as sess:
        saver = tf.train.Saver()
        sess.run(tf.global_variables_initializer())
        merged_summary = tf.summary.merge_all()

        for epoch in range(num_epochs):
            epoch_loss = 0
            for i in range(int(real_X_9.shape[0])/batch_size):#mnist.train.num_examples/batch_size)): # X.shape[0]
                randidx = np.random.choice(real_X_9.shape[0], batch_size, replace=False)
                epoch_x,epoch_y = real_X_9[randidx,:],real_y_9[randidx,:] #mnist.train.next_batch(batch_size) # X,y
                j,c = sess.run([optimizer,cost],feed_dict={x:epoch_x,y:epoch_y,keep_prob:TRAIN_KEEP_PROB})
                if i == 0:
                    [ta] = sess.run([accuracy],feed_dict={x:epoch_x,y:epoch_y,keep_prob:TRAIN_KEEP_PROB})
                    print 'Train Accuracy', ta

                epoch_loss += c
            print '\n','Epoch', epoch + 1, 'completed out of', num_epochs, '\nLoss:',epoch_loss
        saver.save(sess, os.getcwd()+'/models/baseDNN')
train(x)

这是我尝试恢复模型的代码(这是产生错误的代码):

import tensorflow as tf
import os

sess = tf.Session()
saver = tf.train.import_meta_graph(os.getcwd()+'/models/baseDNN.meta')
saver.restore(sess,tf.train.latest_checkpoint('./'))

graph = tf.get_default_graph()

x = graph.get_tensor_by_name('x_placeholder:0')
y = graph.get_tensor_by_name('y_placeholder:0')

op1 = graph.get_tensor_by_name('op1:0')
op2 = graph.get_tensor_by_name('op2:0')
op3 = graph.get_tensor_by_name('op3:0')
op4 = graph.get_tensor_by_name('op4:0')
op5 = graph.get_tensor_by_name('op5:0')
op6 = graph.get_tensor_by_name('op6:0')
op7 = graph.get_tensor_by_name('op7:0')

如有任何帮助,我们将不胜感激。谢谢。

不要认为加载了正确的检查点文件。 检查保存在 /models/baseDNN 目录中的文件。它应该包含 model.ckptmodel.ckpt.meta(或类似的东西)。然后指向正确的文件:

saver = tf.train.import_meta_graph('/path/baseDNN.meta')
saver.restore(sess,'/path/baseDNN')