TensorFlow: Remember LSTM state for next batch (stateful LSTM)
Given a trained LSTM model, I want to perform inference for single timesteps, i.e. seq_length = 1 in the example below. After each timestep the internal LSTM (memory and hidden) states need to be remembered for the next 'batch'. For the very beginning of the inference, the internal LSTM states init_c, init_h are computed from the given input. These are then stored in a LSTMStateTuple object which is passed to the LSTM. During training this state is updated every timestep. However, for inference I want the state to be saved in between batches, i.e. the initial states only need to be computed at the very beginning, and after that the LSTM states should be saved after each 'batch' (n = 1).

I found this related Stack Overflow question: Tensorflow, best way to save state in RNNs?. However, that only works if state_is_tuple=False, and this behavior is soon to be deprecated by TensorFlow (see rnn_cell.py). Keras seems to have a nice wrapper to make stateful LSTMs possible, but I don't know the best way to achieve this in TensorFlow. This issue on the TensorFlow GitHub is also related to my question: https://github.com/tensorflow/tensorflow/issues/2838
Does anyone have good suggestions for building a stateful LSTM model?
inputs  = tf.placeholder(tf.float32, shape=[None, seq_length, 84, 84], name="inputs")
targets = tf.placeholder(tf.float32, shape=[None, seq_length], name="targets")
num_lstm_layers = 2

with tf.variable_scope("LSTM") as scope:
    lstm_cell = tf.nn.rnn_cell.LSTMCell(512, initializer=initializer, state_is_tuple=True)
    self.lstm = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_lstm_layers, state_is_tuple=True)

    init_c = # compute initial LSTM memory state using contents in placeholder 'inputs'
    init_h = # compute initial LSTM hidden state using contents in placeholder 'inputs'
    self.state = [tf.nn.rnn_cell.LSTMStateTuple(init_c, init_h)] * num_lstm_layers

    outputs = []
    for step in range(seq_length):
        if step != 0:
            scope.reuse_variables()
        # CNN features, as input for LSTM
        x_t = # ...
        # LSTM step through time
        output, self.state = self.lstm(x_t, self.state)
        outputs.append(output)
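The behaviour asked for here, running seq_length = 1 'batches' while carrying the (memory, hidden) state between them, can be illustrated framework-free with a minimal NumPy LSTM cell. The weights, sizes, and function names below are arbitrary illustrations, not taken from the question; the point is only that feeding the saved state back in gives the same result as one full-sequence run:

```python
import numpy as np

rng = np.random.default_rng(0)
batch, input_size, hidden = 2, 3, 4

# Random but fixed LSTM parameters (illustrative only).
W = rng.standard_normal((input_size + hidden, 4 * hidden)) * 0.1
b = np.zeros(4 * hidden)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, c, h):
    """One LSTM timestep; returns the new (c, h) state."""
    z = np.concatenate([x, h], axis=1) @ W + b
    i, f, g, o = np.split(z, 4, axis=1)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return c_new, h_new

seq = rng.standard_normal((5, batch, input_size))

# Reference: process the whole sequence in one go (seq_length = 5).
c = h = np.zeros((batch, hidden))
for x in seq:
    c, h = lstm_step(x, c, h)
full_state = (c, h)

# Stateful inference: one seq_length = 1 'batch' per call, saving
# (c, h) in between and feeding it back in -- the pattern from the
# question.
saved = (np.zeros((batch, hidden)), np.zeros((batch, hidden)))
for x in seq:
    saved = lstm_step(x, *saved)

assert np.allclose(full_state[0], saved[0])
assert np.allclose(full_state[1], saved[1])
```

In TensorFlow terms, the `saved` tuple is what would be fetched from the session after each batch and fed back in as the initial state of the next one.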
"Tensorflow, best way to save state in RNNs?" was actually my original question. The code below is how I use state tuples.
with tf.variable_scope('decoder') as scope:
    rnn_cell = tf.nn.rnn_cell.MultiRNNCell \
    ([
        tf.nn.rnn_cell.LSTMCell(512, num_proj=256, state_is_tuple=True),
        tf.nn.rnn_cell.LSTMCell(512, num_proj=WORD_VEC_SIZE, state_is_tuple=True)
    ], state_is_tuple=True)

    state = [[tf.zeros((BATCH_SIZE, sz)) for sz in sz_outer] for sz_outer in rnn_cell.state_size]

    for t in range(TIME_STEPS):
        if t:
            last = y_[t - 1] if TRAINING else y[t - 1]
        else:
            last = tf.zeros((BATCH_SIZE, WORD_VEC_SIZE))
        y[t] = tf.concat(1, (y[t], last))
        y[t], state = rnn_cell(y[t], state)
        scope.reuse_variables()
Rather than using tf.nn.rnn_cell.LSTMStateTuple, I just create a list of lists, which works fine. In this example I am not saving the state. However, you could easily make the state out of variables and just use assign ops to save the values.
I found it easiest to save the whole state for all layers in a single placeholder.
init_state = np.zeros((num_layers, 2, batch_size, state_size))
...
state_placeholder = tf.placeholder(tf.float32, [num_layers, 2, batch_size, state_size])
Then unpack it and create a tuple of LSTMStateTuples before using the native TensorFlow RNN API:
l = tf.unpack(state_placeholder, axis=0)
rnn_tuple_state = tuple(
[tf.nn.rnn_cell.LSTMStateTuple(l[idx][0], l[idx][1])
for idx in range(num_layers)]
)
This is then passed into the RNN API:
cell = tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
cell = tf.nn.rnn_cell.MultiRNNCell([cell]*num_layers, state_is_tuple=True)
outputs, state = tf.nn.dynamic_rnn(cell, x_input_batch, initial_state=rnn_tuple_state)
The state variable returned here is then fed into the placeholder for the next batch.
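The packing convention this answer relies on, all layers' (c, h) states stacked into one (num_layers, 2, batch_size, state_size) array, can be sketched in plain NumPy to show the round trip. The sizes and values below are illustrative, and the unpack mirrors the tf.unpack plus LSTMStateTuple comprehension above:

```python
import numpy as np

num_layers, batch_size, state_size = 2, 3, 4

# Per-layer (c, h) states, as the RNN would return them; filled with
# distinct constants so the round trip is checkable.
states = [
    (np.full((batch_size, state_size), layer, dtype=np.float32),        # c
     np.full((batch_size, state_size), layer + 0.5, dtype=np.float32))  # h
    for layer in range(num_layers)
]

# Pack into the single (num_layers, 2, batch_size, state_size) array
# that would be fed to state_placeholder for the next batch.
packed = np.stack([np.stack([c, h]) for c, h in states])
assert packed.shape == (num_layers, 2, batch_size, state_size)

# Unpack along axis 0 and rebuild the per-layer (c, h) tuples,
# mirroring the rnn_tuple_state construction in the answer.
unpacked = tuple((packed[idx][0], packed[idx][1]) for idx in range(num_layers))

for (c0, h0), (c1, h1) in zip(states, unpacked):
    assert np.array_equal(c0, c1) and np.array_equal(h0, h1)
```

Because the whole state lives in one array, saving it between batches is a single fetch and a single feed, with no per-layer bookkeeping in the session loop.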