Feeding LSTMCell with whole sentences using embeddings gives dimensionality error

Question

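I am currently working on a text classification problem, but I cannot even set up my model in TensorFlow. I have batches of sentences of length 70 (using padding), and I use embedding_lookup with an embedding size of 300. Here is the embedding code: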

embedding = tf.constant(embedding_matrix, name="embedding")
inputs = tf.nn.embedding_lookup(embedding, input_.input_data)
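To make the resulting shape concrete, here is a minimal NumPy sketch of what the lookup does for a single embedding matrix (a row gather). The vocabulary size, batch size, and random data are assumptions for illustration only and are not taken from the original code.

```python
import numpy as np

# Hypothetical sizes: only sentence_length=70 and embedding_size=300 come from the question.
vocab_size, embedding_size = 1000, 300
batch_size, sentence_length = 4, 70

rng = np.random.default_rng(0)
embedding_matrix = rng.standard_normal((vocab_size, embedding_size))

# Padded batch of word ids, one row per sentence.
input_data = rng.integers(0, vocab_size, size=(batch_size, sentence_length))

# Row gather: each word id is replaced by its embedding vector.
inputs = embedding_matrix[input_data]
print(inputs.shape)           # (4, 70, 300)

# A single time step sliced out of the batch is 2-D.
print(inputs[:, 0, :].shape)  # (4, 300)
```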

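So now inputs should have the shape [batch_size, sentence_length, embedding_size], which is not surprising. Unfortunately, my LSTMCell raises a ValueError because it expects ndim=2 and the input obviously has ndim=3. I have not found a way to change the input shape the LSTM layer expects. Here is my LSTMCell initialization code: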

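cells = []
for i in range(num_layers):
    # Pass forget_bias and state_is_tuple by keyword so they bind to the intended parameters.
    cells.append(LSTMCell(num_units, forget_bias=forget_bias, state_is_tuple=state_is_tuple,
                          reuse=reuse, name='lstm_{}'.format(i)))
cell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)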

The error is triggered in the cell's call function, which looks like this:

for step in range(self.num_steps):
    if step > 0: tf.get_variable_scope().reuse_variables()
    (cell_output, state) = cell(inputs[:, step, :], state)

Similar questions that did not help:

Answer 1

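I was able to solve the problem myself. It seems that LSTMCell is a more hands-on, lower-level implementation that stays closer to how an LSTM actually works. The Keras LSTM layer took care of things that I have to handle myself when using TensorFlow directly. The example I used comes from the following official TensorFlow example: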

https://github.com/tensorflow/models/tree/master/tutorials/rnn/ptb

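Because we want to feed a sequence to our LSTM layer, we need to feed the cell one word at a time. Since calling the cell produces two outputs (the cell output and the cell state), we loop over all the words in all the sentences, feeding the cell and reusing the cell state. This way we build up the output of the layer, which can then be used for further operations. The code looks like this: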

self._initial_state = cell.zero_state(config.batch_size, data_type())
state = self._initial_state
outputs = []
with tf.variable_scope("RNN"):
  for time_step in range(self.num_steps):
    if time_step > 0: tf.get_variable_scope().reuse_variables()
    (cell_output, state) = cell(inputs[:, time_step, :], state)
    outputs.append(cell_output)
output = tf.reshape(tf.concat(outputs, 1), [-1, config.hidden_size])

num_steps is the number of words in the sentence that we are going to use.
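To show why the per-step slice matters, here is a minimal NumPy sketch of the same unrolling pattern. It is not TensorFlow's implementation: lstm_step, the weight layout, and all sizes other than 70 and 300 are assumptions made for illustration. The point is that the cell only ever sees a 2-D [batch_size, embedding_size] slice, and the final concat/reshape mirrors the last line of the code above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b, forget_bias=1.0):
    """One LSTM step on a 2-D batch; x_t has shape [batch, input_size]."""
    z = np.concatenate([x_t, h_prev], axis=1) @ W + b  # [batch, 4 * hidden]
    i, j, f, o = np.split(z, 4, axis=1)                # input, candidate, forget, output
    c = sigmoid(f + forget_bias) * c_prev + sigmoid(i) * np.tanh(j)
    h = sigmoid(o) * np.tanh(c)
    return h, c

# Hypothetical sizes; batch_size and hidden_size are illustrative.
batch_size, num_steps, embedding_size, hidden_size = 4, 70, 300, 8
rng = np.random.default_rng(0)
inputs = rng.standard_normal((batch_size, num_steps, embedding_size))
W = rng.standard_normal((embedding_size + hidden_size, 4 * hidden_size)) * 0.01
b = np.zeros(4 * hidden_size)

# Zero initial state, then one cell call per word, reusing the state.
h = np.zeros((batch_size, hidden_size))
c = np.zeros((batch_size, hidden_size))
outputs = []
for t in range(num_steps):
    x_t = inputs[:, t, :]  # 2-D slice: [batch, embedding]
    h, c = lstm_step(x_t, h, c, W, b)
    outputs.append(h)

# Same layout as tf.reshape(tf.concat(outputs, 1), [-1, hidden_size]).
output = np.concatenate(outputs, axis=1).reshape(-1, hidden_size)
print(output.shape)  # (280, 8)
```

With this layout, rows 0 to num_steps - 1 of output hold the per-step outputs of the first sentence, the next num_steps rows hold the second sentence, and so on.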


python

machine-learning

text-classification

lstm

tensorflow