Increasing num of layers in LSTM increases the input dimensions in Pytorch?

Here is my LSTM model; I ran into a peculiar problem while training it.

import torch
import torch.nn as nn

class LSTM1(nn.Module):
    def __init__(self, num_classes, input_size, hidden_size, num_layers, seq_length, drop_prob=0.0):
        super(LSTM1, self).__init__()
        self.num_classes = num_classes  # number of classes
        self.num_layers = num_layers    # number of layers
        self.input_size = input_size    # input size
        self.hidden_size = hidden_size  # hidden state size
        self.seq_length = seq_length    # sequence length

        self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                            num_layers=num_layers, dropout=drop_prob, batch_first=True)  # lstm
        # self.dropout = nn.Dropout(drop_prob)
        # self.fc_1 = nn.Linear(hidden_size, num_classes)
        self.fc_1 = nn.Linear(hidden_size, 64)  # fully connected 1
        self.fc = nn.Linear(64, num_classes)    # fully connected last layer

        self.relu = nn.ReLU()

    def forward(self, x):
        # h_0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size)).to(device)  # hidden state
        # c_0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size)).to(device)  # internal state
        # Propagate input through LSTM
        output, (hn, cn) = self.lstm(x)  # lstm with input, hidden, and internal state

        hn = hn.view(-1, self.hidden_size)  # reshaping the data for the Dense layer next
        # out = self.dropout(hn)
        out = self.relu(hn)
        out = self.fc_1(out)  # first Dense
        out = self.relu(out)  # relu
        out = self.fc(out)    # final output
        out = self.relu(out)  # relu

        return out

My training data has these dimensions, where the first is the input and the second is the labels:

Training Shape torch.Size([8051, 1, 201]) torch.Size([8051, 1])

Training works fine when I use num_layers = 1 for the LSTM layer. However, when I increase num_layers to 2, I get this error:

Training Shape torch.Size([8051, 1, 201]) torch.Size([8051, 1])
Testing Shape torch.Size([4930, 1, 201]) torch.Size([4930, 1])
C:\Users\adity\miniconda3\envs\pytorch3\lib\site-packages\torch\nn\modules\loss.py:528: UserWarning: Using a target size (torch.Size([8051, 1])) that is different to the input size (torch.Size([16102, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  return F.mse_loss(input, target, reduction=self.reduction)
Traceback (most recent call last):
  File "C:\Users\adity\OneDrive - Louisiana State University\Documents\CSC 7343 HW\hw1.py", line 135, in <module>
    loss = criterion(outputs.to(device), y_train_tensors.to(device))
  File "C:\Users\adity\miniconda3\envs\pytorch3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\adity\miniconda3\envs\pytorch3\lib\site-packages\torch\nn\modules\loss.py", line 528, in forward
    return F.mse_loss(input, target, reduction=self.reduction)
  File "C:\Users\adity\miniconda3\envs\pytorch3\lib\site-packages\torch\nn\functional.py", line 3089, in mse_loss
    expanded_input, expanded_target = torch.broadcast_tensors(input, target)
  File "C:\Users\adity\miniconda3\envs\pytorch3\lib\site-packages\torch\functional.py", line 73, in broadcast_tensors
    return _VF.broadcast_tensors(tensors)  # type: ignore[attr-defined]
RuntimeError: The size of tensor a (16102) must match the size of tensor b (8051) at non-singleton dimension 0

When I change num_layers to 3, the error says the tensor should be of size 24153.

Why does the input size change when I increase num_layers?

The problem is that you are flattening hn. According to the documentation, its shape is (D * num_layers, N, H_out), so it grows with the number of layers: with num_layers = 2, hn.view(-1, self.hidden_size) turns your batch of 8051 into 2 × 8051 = 16102 rows, and with num_layers = 3 into 3 × 8051 = 24153. You therefore either have to change the fully connected layers that follow accordingly, or take only the last layer's hidden state from the LSTM.
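Below is a minimal sketch of the second option, keeping only the last layer's hidden state so the batch dimension stays 8051 regardless of num_layers. It reuses the names and the 64-unit intermediate layer from the model above as an illustration, not a drop-in replacement; since the LSTM is unidirectional with batch_first=True, output[:, -1, :] would work equally well.

import torch
import torch.nn as nn

class LSTM1(nn.Module):
    def __init__(self, num_classes, input_size, hidden_size, num_layers, drop_prob=0.0):
        super(LSTM1, self).__init__()
        self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                            num_layers=num_layers, dropout=drop_prob, batch_first=True)
        self.fc_1 = nn.Linear(hidden_size, 64)
        self.fc = nn.Linear(64, num_classes)
        self.relu = nn.ReLU()

    def forward(self, x):
        output, (hn, cn) = self.lstm(x)
        # hn has shape (num_layers, N, hidden_size); keep only the top layer -> (N, hidden_size)
        hn = hn[-1]
        out = self.relu(hn)
        out = self.fc_1(out)   # first Dense
        out = self.relu(out)
        out = self.fc(out)     # final output
        out = self.relu(out)
        return out

# With inputs of shape (8051, 1, 201) and num_classes = 1, the output is (8051, 1)
# for any value of num_layers, matching the target shape used by the MSE loss.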