Increasing num_layers in an LSTM increases the input dimensions in PyTorch?
Here is my LSTM model; I ran into a peculiar problem while training it.
import torch
import torch.nn as nn

class LSTM1(nn.Module):
    def __init__(self, num_classes, input_size, hidden_size, num_layers, seq_length, drop_prob=0.0):
        super(LSTM1, self).__init__()
        self.num_classes = num_classes  # number of classes
        self.num_layers = num_layers    # number of layers
        self.input_size = input_size    # input size
        self.hidden_size = hidden_size  # hidden state
        self.seq_length = seq_length    # sequence length

        self.lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                            num_layers=num_layers, dropout=drop_prob, batch_first=True)  # lstm
        # self.dropout = nn.Dropout(drop_prob)
        # self.fc_1 = nn.Linear(hidden_size, num_classes)
        self.fc_1 = nn.Linear(hidden_size, 64)  # fully connected 1
        self.fc = nn.Linear(64, num_classes)    # fully connected last layer
        self.relu = nn.ReLU()

    def forward(self, x):
        # h_0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size)).to(device)  # hidden state
        # c_0 = Variable(torch.zeros(self.num_layers, x.size(0), self.hidden_size)).to(device)  # internal state
        # Propagate input through LSTM
        output, (hn, cn) = self.lstm(x)     # lstm with input, hidden, and internal state
        hn = hn.view(-1, self.hidden_size)  # reshaping the data for the Dense layer next
        # out = self.dropout(hn)
        out = self.relu(hn)
        out = self.fc_1(out)  # first Dense
        out = self.relu(out)  # relu
        out = self.fc(out)    # final output
        out = self.relu(out)  # relu
        return out
My training data has these dimensions, where the first tensor is the inputs and the second is the labels:
Training Shape torch.Size([8051, 1, 201]) torch.Size([8051, 1])
Training with num_layers = 1 for the LSTM layer works fine.
However, when I increase num_layers to 2, I get this error:
Training Shape torch.Size([8051, 1, 201]) torch.Size([8051, 1])
Testing Shape torch.Size([4930, 1, 201]) torch.Size([4930, 1])
C:\Users\adity\miniconda3\envs\pytorch3\lib\site-packages\torch\nn\modules\loss.py:528: UserWarning: Using a target size (torch.Size([8051, 1])) that is different to the input size (torch.Size([16102, 1])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
return F.mse_loss(input, target, reduction=self.reduction)
Traceback (most recent call last):
File "C:\Users\adity\OneDrive - Louisiana State University\Documents\CSC 7343 HW\hw1.py", line 135, in <module>
loss = criterion(outputs.to(device), y_train_tensors.to(device))
File "C:\Users\adity\miniconda3\envs\pytorch3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\adity\miniconda3\envs\pytorch3\lib\site-packages\torch\nn\modules\loss.py", line 528, in forward
return F.mse_loss(input, target, reduction=self.reduction)
File "C:\Users\adity\miniconda3\envs\pytorch3\lib\site-packages\torch\nn\functional.py", line 3089, in mse_loss
expanded_input, expanded_target = torch.broadcast_tensors(input, target)
File "C:\Users\adity\miniconda3\envs\pytorch3\lib\site-packages\torch\functional.py", line 73, in broadcast_tensors
return _VF.broadcast_tensors(tensors) # type: ignore[attr-defined]
RuntimeError: The size of tensor a (16102) must match the size of tensor b (8051) at non-singleton dimension 0
When I change the number of layers to 3, the error says the tensor size should be 24153.
Why does the input size change when I increase num_layers?
The problem is that you are flattening hn, which according to the documentation page has shape (D*num_layers, N, H_out), i.e. it depends on the number of hidden layers. So you either have to change the fully connected layer that follows, or take only the last hidden state of your LSTM.
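For illustration, here is a minimal sketch (hidden_size = 32 is a made-up value; the batch, sequence, and input sizes match the shapes from the question) showing how the flattened hn grows with num_layers, and how indexing hn[-1] keeps the batch dimension fixed:

import torch
import torch.nn as nn

# Hypothetical hidden_size; batch/seq/input sizes taken from the question
batch, seq_len, input_size, hidden_size = 8051, 1, 201, 32
x = torch.randn(batch, seq_len, input_size)

for num_layers in (1, 2, 3):
    lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
                   num_layers=num_layers, batch_first=True)
    output, (hn, cn) = lstm(x)
    # hn is (D*num_layers, N, H_out), so flattening multiplies the batch
    # dimension by num_layers: 8051 -> 16102 -> 24153
    print(num_layers, hn.shape, hn.view(-1, hidden_size).shape)

# Taking only the last layer's hidden state stays (N, H_out) regardless of
# num_layers, so the loss again sees one prediction per training example
last_hidden = hn[-1]
print(last_hidden.shape)  # torch.Size([8051, 32])

Equivalently, in the model's forward you would replace hn = hn.view(-1, self.hidden_size) with hn = hn[-1] (or, as the other option, resize the following Linear layer to accept num_layers * hidden_size features).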