Pytorch inconsistent size with pad_packed_sequence, seq2seq
There is something inconsistent about the encoder output I get from this github.
The encoder looks as follows:
import pdb

import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence


class Encoder(nn.Module):
    r"""Applies a multi-layer LSTM to a variable-length input sequence."""

    def __init__(self, input_size, hidden_size, num_layers,
                 dropout=0.0, bidirectional=True, rnn_type='lstm'):
        super(Encoder, self).__init__()
        self.input_size = 40
        self.hidden_size = 512
        self.num_layers = 8
        self.bidirectional = True
        self.rnn_type = 'lstm'
        self.dropout = 0.0
        if self.rnn_type == 'lstm':
            self.rnn = nn.LSTM(input_size, hidden_size, num_layers,
                               batch_first=True,
                               dropout=dropout,
                               bidirectional=bidirectional)

    def forward(self, padded_input, input_lengths):
        """
        Args:
            padded_input: N x T x D
            input_lengths: N
        Returns: output, hidden
            - **output**: N x T x H
            - **hidden**: (num_layers * num_directions) x N x H
        """
        total_length = padded_input.size(1)  # get the max sequence length
        packed_input = pack_padded_sequence(padded_input, input_lengths,
                                            batch_first=True, enforce_sorted=False)
        packed_output, hidden = self.rnn(packed_input)
        pdb.set_trace()
        output, _ = pad_packed_sequence(packed_output, batch_first=True,
                                        total_length=total_length)
        return output, hidden
So it contains just a single RNN (LSTM) module, and if I print the encoder this is the output:
LSTM(40, 512, num_layers=8, batch_first=True, bidirectional=True)
So it should have an output of size 512, right? But when I feed it a tensor of size torch.Size([16, 1025, 40]), that is, 16 samples of 1025 vectors of size 40 (packed to fit the RNN), the output I get back from the RNN has a new encoding size of 1024: torch.Size([16, 1025, 1024]). Shouldn't it be encoded to 512?
Am I missing something?
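A minimal call that reproduces the shapes I am describing (random data, all lengths set to the padded length, just to show the sizes) would look something like this:

import torch

# Assumes the Encoder class defined above; the lengths here are arbitrary.
encoder = Encoder(input_size=40, hidden_size=512, num_layers=8)

padded_input = torch.randn(16, 1025, 40)                   # N x T x D
input_lengths = torch.full((16,), 1025, dtype=torch.long)  # N

# Note: forward() hits pdb.set_trace(); type "c" at the prompt to continue.
output, hidden = encoder(padded_input, input_lengths)
print(output.shape)   # torch.Size([16, 1025, 1024]) rather than ([16, 1025, 512])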
Setting bidirectional=True makes the LSTM bidirectional, which means there will be two LSTMs, one going from left to right and the other from right to left.
From the nn.LSTM documentation - Outputs:
output of shape (seq_len, batch, num_directions * hidden_size): tensor containing the output features (h_t) from the last layer of the LSTM, for each t. If a torch.nn.utils.rnn.PackedSequence
has been given as the input, the output will also be a packed sequence.
For the unpacked case, the directions can be separated using output.view(seq_len, batch, num_directions, hidden_size)
, with forward and backward being direction 0 and 1 respectively. Similarly, the directions can be separated in the packed case.
Since you are using a bidirectional LSTM, your output has size [batch, seq_len, 2 * hidden_size] (batch and seq_len are swapped in your case because you set batch_first=True). The outputs of the two directions are concatenated so that you get the information from both, and you can easily separate them if you want to treat them differently.