How to build a custom bidirectional encoder for seq2seq with TF2?

class Encoder(tf.keras.Model):
  def __init__(self, vocab_size, embedding_dim, enc_units, batch_sz):
    super(Encoder, self).__init__()
    self.batch_sz = batch_sz
    self.enc_units = enc_units

    self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
    self.gru = tf.keras.layers.GRU(self.enc_units,
                                   return_sequences=True,
                                   return_state=True,
                                   recurrent_initializer='glorot_uniform')

    self.bigru = tf.keras.layers.Bidirectional(
        tf.keras.layers.GRU(self.enc_units,
                            return_sequences=True,
                            return_state=True,
                            recurrent_initializer='glorot_uniform'))

  def call(self, x):
    x = self.embedding(x)
    # output, state = self.gru(x)
    output, state = self.bigru(x)

    return output, state

With the code above, everything works when I use the gru layer. But when I use the bigru layer, I get the following error:

ValueError: in converted code:

<ipython-input-51-3ba1fe0beb05>:8 train_step_seq2seq  *
    enc_output, enc_hidden = encoder(inp)
/tensorflow-2.0.0/python3.6/tensorflow_core/python/keras/engine/base_layer.py:847 __call__
    outputs = call_fn(cast_inputs, *args, **kwargs)
<ipython-input-53-4f1b00e47a9a>:22 call  *
    output, state = self.bidir(x)

ValueError: too many values to unpack (expected 2)

So now I'm wondering what is going on here?

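It is not well documented, but a Bidirectional layer wrapping a GRU with return_state=True (unlike a unidirectional RNN layer) returns a triple, not a pair:

  • the concatenated per-timestep states of the forward and backward RNNs (shape: batch × length × 2·GRU dimension)
  • the final state of the forward RNN (shape: batch × GRU dimension)
  • the final state of the backward RNN (shape: batch × GRU dimension)

Unpacking that result into two names is what raises "too many values to unpack (expected 2)".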
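
A minimal sketch of one way to adapt the encoder, assuming the rest of the seq2seq code expects a single hidden state. The class name BiEncoder and the choice to concatenate the two final states are illustrative assumptions, not something the question prescribes. Summing the states or keeping only one direction would also work, and concatenating doubles the state size, so a decoder initialised from it would need 2 × enc_units units.

```python
import tensorflow as tf


class BiEncoder(tf.keras.Model):
    """Hypothetical variant of the question's Encoder using only the bigru layer."""

    def __init__(self, vocab_size, embedding_dim, enc_units):
        super().__init__()
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.bigru = tf.keras.layers.Bidirectional(
            tf.keras.layers.GRU(enc_units,
                                return_sequences=True,
                                return_state=True,
                                recurrent_initializer='glorot_uniform'))

    def call(self, x):
        x = self.embedding(x)
        # A Bidirectional GRU with return_state=True yields three tensors:
        # the sequence output plus one final state per direction.
        output, forward_state, backward_state = self.bigru(x)
        # Assumption: merge both final states into one tensor so callers
        # can keep unpacking two values (output, state).
        state = tf.concat([forward_state, backward_state], axis=-1)
        return output, state


# Small smoke run with made-up sizes.
encoder = BiEncoder(vocab_size=100, embedding_dim=8, enc_units=16)
tokens = tf.zeros((4, 10), dtype=tf.int32)   # batch of 4, length 10
enc_output, enc_hidden = encoder(tokens)
# enc_output has shape batch x length x 2*enc_units,
# enc_hidden has shape batch x 2*enc_units.
print(enc_output.shape, enc_hidden.shape)
```

With this change the call site from the traceback, enc_output, enc_hidden = encoder(inp), unpacks cleanly again.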