Building multiple models in the same graph
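I am trying to build two similar models that predict different output types. One predicts between two classes and the other has six output classes. Their inputs are the same, and both are LSTM RNNs.
I have separated training and prediction into individual functions in each model's file, model1.py and model2.py.
I made the mistake of giving the variables in each model the same names, so when I call predict1 and predict2 from model1 and model2 respectively, I get the following namespace error:
ValueError: Variable W already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:
where W is the name of the weight matrix.
Is there a good way to run these predictions from the same place? I tried renaming the variables involved, but I still get the following error. It seems it is not possible to name an lstm_cell when it is created, is it?
ValueError: Variable RNN/BasicLSTMCell/Linear/Matrix already exists
Edit: After adding scopes around model1pred and model2pred in the predictions file, I get the following error when calling model1pred() and then model2pred():
tensorflow.python.framework.errors.NotFoundError: Tensor name "model1/model1/BasicLSTMCell/Linear/Matrix" not found in checkpoint files ./variables/model1.chk
Edit: The code is included here. The code in model2.py is omitted, but it is equivalent to the code in model1.py, except that n_classes=2 and that the scope is set to 'model2' inside the dynamicRNN function and inside pred.
Solution: The problem was that the saver was trying to restore from a graph that still contained the variables from the first pred() execution. I was able to fix this by wrapping the calls to the pred functions in different graphs, which removed the need for variable scopes.
In the file that collects the predictions: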
import tensorflow as tf

def model1pred(test_x, test_seqlen):
    from model1 import pred
    # Build and run model 1 inside its own graph.
    with tf.Graph().as_default():
        return pred(test_x, test_seqlen)

def model2pred(test_x, test_seqlen):
    from model2 import pred
    # Build and run model 2 inside its own graph.
    with tf.Graph().as_default():
        return pred(test_x, test_seqlen)

## Import test_x, test_seqlen
probs1, preds1 = model1pred(test_x, test_seqlen)
probs2, cpreds2 = model2pred(test_x, test_seqlen)
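To see why separate graphs remove the clash, it may help to picture each graph as owning its own table of variable names. The following is a toy model in plain Python, not TensorFlow code: `ToyGraph`, `create_variable` and `build_model` are hypothetical names invented for illustration, and the real library does considerably more.

```python
class ToyGraph:
    """Hypothetical stand-in for a graph: it only tracks variable names."""

    def __init__(self):
        self.variables = {}

    def create_variable(self, name, value):
        # Mimic the clash reported in the question: a name may be
        # registered only once per graph.
        if name in self.variables:
            raise ValueError("Variable %s already exists" % name)
        self.variables[name] = value
        return name


def build_model(graph, n_classes):
    # Both models use the same variable names, as in the question.
    graph.create_variable("W", [0.0] * n_classes)
    graph.create_variable("b", [0.0] * n_classes)


def build_both_in_one_graph():
    shared = ToyGraph()
    build_model(shared, 6)
    build_model(shared, 2)  # raises ValueError: "W" is already registered


def build_in_separate_graphs():
    graph1, graph2 = ToyGraph(), ToyGraph()
    build_model(graph1, 6)
    build_model(graph2, 2)  # fine: graph2 has its own name table
    return graph1, graph2
```

Under this simplified picture, wrapping each pred() call in its own graph plays the role of `build_in_separate_graphs`: the second model never sees the names registered by the first.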
In model1.py:
# Imports assumed for this TensorFlow 0.x-era code; they are not shown in the question.
import numpy as np
import tensorflow as tf
from tensorflow.python.ops import rnn_cell

def dynamicRNN(x, seqlen, weights, biases):
    n_steps = 10
    n_input = 14
    n_classes = 6
    n_hidden = 100
    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, n_steps, n_input)
    # Required shape: 'n_steps' tensors list of shape (batch_size, n_input)
    # Permuting batch_size and n_steps
    x = tf.transpose(x, [1, 0, 2])
    # Reshaping to (n_steps*batch_size, n_input)
    x = tf.reshape(x, [-1, n_input])
    # Split to get a list of 'n_steps' tensors of shape (batch_size, n_input)
    x = tf.split(0, n_steps, x)
    # Define a lstm cell with tensorflow
    lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0)
    # Get lstm cell output, providing 'sequence_length' will perform dynamic calculation.
    outputs, states = tf.nn.rnn(lstm_cell, x, dtype=tf.float32, sequence_length=seqlen)
    # When performing dynamic calculation, we must retrieve the last
    # dynamically computed output, i.e., if a sequence length is 10, we need
    # to retrieve the 10th output.
    # However TensorFlow doesn't support advanced indexing yet, so we build
    # a custom op that, for each sample in the batch, gets its length and
    # gets the corresponding relevant output.
    # 'outputs' is a list of outputs at every timestep; we pack them into a Tensor
    # and change the dimensions back to [batch_size, n_steps, n_hidden]
    outputs = tf.pack(outputs)
    outputs = tf.transpose(outputs, [1, 0, 2])
    # Hack to build the indexing and retrieve the right output.
    batch_size = tf.shape(outputs)[0]
    # Index of the last valid step for each sample in the flattened outputs
    index = tf.range(0, batch_size) * n_steps + (seqlen - 1)
    # Indexing
    outputs = tf.gather(tf.reshape(outputs, [-1, n_hidden]), index)
    # Linear activation, using outputs computed above
    return tf.matmul(outputs, weights['out']) + biases['out']

def pred(test_x, test_seqlen):
    with tf.Session() as sess:
        n_steps = 10
        n_input = 14
        n_classes = 6
        n_hidden = 100
        weights = {'out': tf.Variable(tf.random_normal([n_hidden, n_classes]), name='W1')}
        biases = {'out': tf.Variable(tf.random_normal([n_classes]), name='b1')}
        x = tf.placeholder("float", [None, n_steps, n_input])
        y = tf.placeholder("float", [None, n_classes])
        seqlen = tf.placeholder(tf.int32, [None])
        logits = dynamicRNN(x, seqlen, weights, biases)
        saver = tf.train.Saver(tf.all_variables())
        y_p = tf.argmax(logits, 1)
        init = tf.initialize_all_variables()
        sess.run(init)
        # Overwrite the freshly initialized values with the trained ones.
        saver.restore(sess, './variables/model1.chk')
        y_prob, y_pred = sess.run([logits, y_p], feed_dict={x: test_x, seqlen: test_seqlen})
        # softmax is a helper that is not shown in the question.
        y_prob = np.array([softmax(row) for row in y_prob])
        return y_prob, y_pred
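
The indexing step in dynamicRNN is easier to follow with small numbers. The sketch below reproduces the same arithmetic with plain Python lists instead of tensors; the helper name `last_relevant_rows` and the toy values are made up for illustration.

```python
def last_relevant_rows(outputs, seqlen):
    """Pick, for each sample, the output at its last valid time step.

    outputs: nested list of shape [batch_size][n_steps], each entry a row.
    seqlen:  list of true sequence lengths, one per sample (1 <= len <= n_steps).
    """
    batch_size = len(outputs)
    n_steps = len(outputs[0])
    # Flatten to batch_size * n_steps rows, like tf.reshape(outputs, [-1, n_hidden]).
    flat = [row for sample in outputs for row in sample]
    # Same formula as in dynamicRNN: sample offset plus last valid step.
    index = [i * n_steps + (seqlen[i] - 1) for i in range(batch_size)]
    # Equivalent of tf.gather(flat, index).
    return [flat[j] for j in index]


# Two samples, three time steps, hidden size 1; labels encode (sample, step).
outputs = [
    [["s0t0"], ["s0t1"], ["s0t2"]],
    [["s1t0"], ["s1t1"], ["s1t2"]],
]
picked = last_relevant_rows(outputs, seqlen=[2, 3])
print(picked)  # [['s0t1'], ['s1t2']]
```

With three steps per sample, the computed indices are 0*3 + 1 = 1 and 1*3 + 2 = 5, so the first sample contributes its second step and the second sample its third.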
You can do this by adding with tf.variable_scope(): blocks around the two pieces of model construction code. This has the effect of adding a different prefix to the names of the variables, which avoids the clash.
For example (using the model1pred() and model2pred() functions defined in your question):
with tf.variable_scope('model1'):
    # Variables created in here will be named 'model1/W', etc.
    probs1, preds1 = model1pred(test_x, test_seqlen)
with tf.variable_scope('model2'):
    # Variables created in here will be named 'model2/W', etc.
    probs2, cpreds2 = model2pred(test_x, test_seqlen)
For more details, see the in-depth HOWTO on variable sharing in TensorFlow.
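
One caveat ties back to the NotFoundError in the question's edit: a checkpoint stores values under the variable names that were in use when it was saved, so a prefix added only at prediction time produces names the checkpoint does not contain. The sketch below mimics that lookup with a plain dictionary; `toy_restore`, `scoped` and the checkpoint contents are hypothetical, and this is not how TensorFlow's Saver is implemented.

```python
def toy_restore(checkpoint, variable_names):
    """Look up each variable by its full name, as a name-keyed restore would."""
    missing = [name for name in variable_names if name not in checkpoint]
    if missing:
        raise KeyError("not found in checkpoint: %s" % ", ".join(missing))
    return {name: checkpoint[name] for name in variable_names}


def scoped(prefix, name):
    # A variable scope prepends "<prefix>/" to every name created inside it.
    return prefix + "/" + name


# Pretend the model was trained inside a 'model1' scope (made-up contents).
checkpoint = {"model1/BasicLSTMCell/Linear/Matrix": "saved weights"}

# Rebuilding with the same single scope finds the saved value.
same_names = [scoped("model1", "BasicLSTMCell/Linear/Matrix")]
restored = toy_restore(checkpoint, same_names)

# Wrapping the call in a second 'model1' scope doubles the prefix,
# giving 'model1/model1/...', which is absent from the checkpoint.
doubled = [scoped("model1", name) for name in same_names]
```

Calling `toy_restore(checkpoint, doubled)` raises a KeyError, which is the toy counterpart of the NotFoundError. In other words, the scopes in effect when restoring have to match the ones in effect when the checkpoint was written.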