PyTorch path generation with RNN - confusion with input, output, hidden and batch sizes
I am new to PyTorch. I followed a tutorial on generating sentences with an RNN and I am trying to modify it to generate sequences of positions, but I am having trouble defining the correct model parameters such as input_size, output_size, hidden_dim, batch_size.
Background:
I have 596 sequences of x,y positions, each looking like [[x1,y1],[x2,y2],...,[xn,yn]]. Each sequence represents the 2D path of a vehicle. I would like to train a model that, given a starting point (or a partial sequence), can generate one of these sequences.
- I have padded/truncated the sequences so that they are all of length 50, which means each sequence is an array of shape [50,2].
- I then split this data into input_seq and target_seq:
input_seq: tensor of torch.Size([596, 49, 2]). Contains all 596 sequences, each without its last position.
target_seq: tensor of torch.Size([596, 49, 2]). Contains all 596 sequences, each without its first position.
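(For context, a minimal sketch of how such a split could be produced, assuming the padded data is already a single tensor of shape [596, 50, 2]; the variable name paths is made up for illustration:)

import torch

# Hypothetical padded data: 596 sequences of 50 (x, y) positions each.
paths = torch.randn(596, 50, 2)

# Drop the last position of every sequence to form the inputs, and drop the
# first position to form the targets (next-step prediction).
input_seq = paths[:, :-1, :]   # shape [596, 49, 2]
target_seq = paths[:, 1:, :]   # shape [596, 49, 2]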
Model class:
class Model(nn.Module):
    def __init__(self, input_size, output_size, hidden_dim, n_layers):
        super(Model, self).__init__()

        # Defining some parameters
        self.hidden_dim = hidden_dim
        self.n_layers = n_layers

        # Defining the layers
        # RNN Layer
        self.rnn = nn.RNN(input_size, hidden_dim, n_layers, batch_first=True)
        # Fully connected layer
        self.fc = nn.Linear(hidden_dim, output_size)

    def forward(self, x):
        batch_size = x.size(0)

        # Initializing hidden state for first input using method defined below
        hidden = self.init_hidden(batch_size)

        # Passing in the input and hidden state into the model and obtaining outputs
        out, hidden = self.rnn(x, hidden)

        # Reshaping the outputs such that it can be fit into the fully connected layer
        out = out.contiguous().view(-1, self.hidden_dim)
        out = self.fc(out)

        return out, hidden

    def init_hidden(self, batch_size):
        # This method generates the first hidden state of zeros which we'll use in the forward pass
        # We'll send the tensor holding the hidden state to the device we specified earlier as well
        hidden = torch.zeros(self.n_layers, batch_size, self.hidden_dim)
        return hidden
I instantiate the model with the following parameters:
input_size of 2 ([x,y] position)
output_size of 2 ([x,y] position)
hidden_dim of 2 (an [x,y] position) (or should this be 50 as in the length of a full sequence?)
model = Model(input_size=2, output_size=2, hidden_dim=2, n_layers=1)

n_epochs = 100
lr = 0.01

# Define Loss, Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

# Training Run
for epoch in range(1, n_epochs + 1):
    optimizer.zero_grad() # Clears existing gradients from previous epoch
    output, hidden = model(input_seq)
    loss = criterion(output, target_seq.view(-1).long())
    loss.backward() # Does backpropagation and calculates gradients
    optimizer.step() # Updates the weights accordingly

    if epoch % 10 == 0:
        print('Epoch: {}/{}.............'.format(epoch, n_epochs), end=' ')
        print("Loss: {:.4f}".format(loss.item()))
When I run the training loop, it fails with this error:
ValueError Traceback (most recent call last)
<ipython-input-9-ad1575e0914b> in <module>
3 optimizer.zero_grad() # Clears existing gradients from previous epoch
4 output, hidden = model(input_seq)
----> 5 loss = criterion(output, target_seq.view(-1).long())
6 loss.backward() # Does backpropagation and calculates gradients
7 optimizer.step() # Updates the weights accordingly
...
ValueError: Expected input batch_size (29204) to match target batch_size (58408).
I have tried modifying input_size, output_size, hidden_dim and batch_size and reshaping the tensors, but the more I try the more confused I get. Could someone point out what I am doing wrong?
Furthermore, since batch size is defined as x.size(0) in Model.forward(self, x), does that mean I only have a single batch of size 596? What would be the correct way to have multiple smaller batches?
The output has size [batch_size * seq_len, 2] = [29204, 2] (596 * 49 = 29204), and then you flatten target_seq with .view(-1), which gives it size [batch_size * seq_len * 2] = [58408] (596 * 49 * 2 = 58408). They have the same total number of elements but not the same number of dimensions, hence the first dimensions don't match.
Dimension mismatch aside, nn.CrossEntropyLoss is a categorical loss function, which means it would only predict a class from the output. You don't have any classes; you are trying to predict coordinates, which are continuous values. For this you need to use a regression loss function, such as nn.MSELoss, which computes the squared error/distance between the predicted and the target coordinates.
criterion = nn.MSELoss()
# .flatten() does the same thing as .view(-1) but is more descriptive
loss = criterion(output.flatten(), target_seq.flatten())
The flattening can be avoided altogether, since both the loss function and the linear layer can operate on multidimensional inputs. That removes the risk of getting lost when flattening and restoring the dimensions, and the output is easier to inspect or to use later outside of training. For the linear layer, only the last dimension of the input needs to match the in_features of nn.Linear, which is hidden_dim in your case.
def forward(self, x):
    batch_size = x.size(0)

    # Initializing hidden state for first input using method defined below
    hidden = self.init_hidden(batch_size)

    # Passing in the input and hidden state into the model and obtaining outputs
    # out size: [batch_size, seq_len, hidden_dim]
    out, hidden = self.rnn(x, hidden)

    # out size: [batch_size, seq_len, output_size]
    out = self.fc(out)

    return out, hidden
Now the output of the model has the same size as target_seq, and you can call the loss function directly without any flattening:
loss = criterion(output, target_seq)
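Purely for illustration, with the adjusted forward method and nn.MSELoss the single-batch training loop could then reduce to something like this (a sketch, reusing the variable names from the question):

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=lr)

for epoch in range(1, n_epochs + 1):
    optimizer.zero_grad()
    output, hidden = model(input_seq)      # output: [596, 49, 2]
    loss = criterion(output, target_seq)   # same size as output, no reshaping needed
    loss.backward()
    optimizer.step()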
hidden_dim of 2 (an [x,y] position) (or should this be 50 as in the length of a full sequence?)
hidden_dim is not an [x, y] pair and is completely unrelated to input_size and output_size. It defines the number of hidden features of the RNN, which determines its capacity: a larger size potentially has more room to retain essential information, but also requires more computation. There is no perfect hidden size, and it largely depends on the use case. You can try different sizes, e.g. 100, 256, etc., and see whether that improves your results.
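For example, only the model instantiation needs to change (256 here is just one possible choice); the output shape is unaffected by hidden_dim thanks to the final linear layer:

# Same input/output sizes, only the number of hidden features changes.
model = Model(input_size=2, output_size=2, hidden_dim=256, n_layers=1)

output, hidden = model(input_seq)
print(output.shape)  # torch.Size([596, 49, 2])
print(hidden.shape)  # torch.Size([1, 596, 256]) - [n_layers, batch_size, hidden_dim]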
Furthermore, since batch size is defined as x.size(0) in Model.forward(self,x), this means I only have a single batch of size 596 right? What would be the correct way to have multiple smaller batches?
Yes, you only have a single batch of size 596. If you want to use smaller batches, for example if you cannot fit all of the data into memory with a more complex model, you could simply use slices of the tensors, but it is better to use PyTorch's data utilities: torch.utils.data.TensorDataset to get a dataset, where each sequence of the input has a corresponding target, in combination with torch.utils.data.DataLoader, which creates the batches for you.
from torch.utils.data import DataLoader, TensorDataset

# Match each sequence of the input_seq to the corresponding target_seq.
# e.g. dataset[0] == (input_seq[0], target_seq[0])
dataset = TensorDataset(input_seq, target_seq)

# Randomly shuffle the data and load it in batches of 16
data_loader = DataLoader(dataset, batch_size=16, shuffle=True)

# Process one batch at a time
for input, target in data_loader:
    output, hidden = model(input)
    loss = criterion(output, target)
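If it helps, here is a minimal sketch of how that batch loop might be nested inside the epoch loop, assuming the MSELoss criterion and the optimizer defined earlier:

for epoch in range(1, n_epochs + 1):
    epoch_loss = 0.0
    for input, target in data_loader:
        optimizer.zero_grad()             # Clear gradients from the previous batch
        output, hidden = model(input)
        loss = criterion(output, target)
        loss.backward()                   # Backpropagate per batch
        optimizer.step()                  # Update the weights per batch
        epoch_loss += loss.item()

    if epoch % 10 == 0:
        print("Epoch: {}/{} Loss: {:.4f}".format(epoch, n_epochs, epoch_loss / len(data_loader)))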