How to build a custom bidirectional encoder for seq2seq with TF2?
import tensorflow as tf

class Encoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, enc_units, batch_sz):
        super(Encoder, self).__init__()
        self.batch_sz = batch_sz
        self.enc_units = enc_units
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.gru = tf.keras.layers.GRU(self.enc_units,
                                       return_sequences=True,
                                       return_state=True,
                                       recurrent_initializer='glorot_uniform')
        self.bigru = tf.keras.layers.Bidirectional(
            tf.keras.layers.GRU(self.enc_units,
                                return_sequences=True,
                                return_state=True,
                                recurrent_initializer='glorot_uniform'))

    def call(self, x):
        x = self.embedding(x)
        # output, state = self.gru(x)
        output, state = self.bigru(x)
        return output, state
With the code above, everything works when I use the gru layer. But when I switch to the bigru layer, I get the following error:
ValueError: in converted code:
<ipython-input-51-3ba1fe0beb05>:8 train_step_seq2seq *
enc_output, enc_hidden = encoder(inp)
/tensorflow-2.0.0/python3.6/tensorflow_core/python/keras/engine/base_layer.py:847 __call__
outputs = call_fn(cast_inputs, *args, **kwargs)
<ipython-input-53-4f1b00e47a9a>:22 call *
output, state = self.bidir(x)
ValueError: too many values to unpack (expected 2)
So I'm wondering: what is going on here?
It is not well documented, but the Bidirectional layer (unlike a unidirectional RNN layer) returns a triple:
- the concatenated outputs of the forward and backward RNNs (shape: batch × length × 2·GRU dimension)
- the final state of the forward RNN (shape: batch × GRU dimension)
- the final state of the backward RNN (shape: batch × GRU dimension)
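A minimal sketch of the fix: unpack all three return values in `call`, and (one common choice) concatenate the two final states into a single encoder state so the rest of the seq2seq code can keep treating it as one tensor. The `tf.concat` step is an assumption about how you want to combine the states, not something the layer mandates:

```python
import tensorflow as tf

class Encoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, enc_units, batch_sz):
        super(Encoder, self).__init__()
        self.batch_sz = batch_sz
        self.enc_units = enc_units
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
        self.bigru = tf.keras.layers.Bidirectional(
            tf.keras.layers.GRU(self.enc_units,
                                return_sequences=True,
                                return_state=True,
                                recurrent_initializer='glorot_uniform'))

    def call(self, x):
        x = self.embedding(x)
        # Bidirectional over a GRU with return_state=True yields
        # (sequence_output, forward_final_state, backward_final_state)
        output, forward_state, backward_state = self.bigru(x)
        # Combine the two directions' final states into one encoder state
        state = tf.concat([forward_state, backward_state], axis=-1)
        return output, state
```

With `enc_units=16`, a batch of 4 sequences of length 10 gives `output` of shape (4, 10, 32) and `state` of shape (4, 32), since both directions are concatenated.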