PyTorch NN not training
I have a custom neural network model that works, and I want to port it to the PyTorch framework. However, the network does not train, apparently because of some misconfiguration. Please let me know if you see anything odd/wrong or anything that could be a contributing cause.
import torch
from torch import nn, optim
import torch.nn.functional as F
X_train_t = torch.tensor(X_train).float()
X_test_t = torch.tensor(X_test).float()
y_train_t = torch.tensor(y_train).long().reshape(y_train.shape[0], 1)
y_test_t = torch.tensor(y_test).long().reshape(y_test.shape[0], 1)
class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(22, 10)
        self.fc2 = nn.Linear(10, 1)

    def forward(self, x):
        # make sure input tensor is flattened
        x = x.view(x.shape[0], -1)
        x = F.relu(self.fc1(x))
        x = F.log_softmax(self.fc2(x), dim=1)
        return x
model = Classifier()
criterion = nn.BCELoss()
optimizer = optim.SGD(model.parameters(), lr=0.003)
epochs = 2000
steps = 0
train_losses, test_losses = [], []
for e in range(epochs):
    # training loss
    optimizer.zero_grad()
    log_ps = model(X_train_t)
    loss = criterion(log_ps, y_train_t.type(torch.float32))
    loss.backward()
    optimizer.step()
    train_loss = loss.item()

    # test loss
    # Turn off gradients for validation, saves memory and computations
    with torch.no_grad():
        log_ps = model(X_test_t)
        test_loss = criterion(log_ps, y_test_t.to(torch.float32))
        ps = torch.exp(log_ps)

    train_losses.append(train_loss/len(X_train_t))
    test_losses.append(test_loss/len(X_test_t))

    if (e % 100 == 0):
        print("Epoch: {}/{}.. ".format(e, epochs),
              "Training Loss: {:.3f}.. ".format(train_loss/len(X_train_t)),
              "Test Loss: {:.3f}.. ".format(test_loss/len(X_test_t)))
It is not training:
Epoch: 0/2000.. Training Loss: 0.014.. Test Loss: 0.082..
Epoch: 100/2000.. Training Loss: 0.014.. Test Loss: 0.082..
...
The root of the problem is that you apply a softmax operation to the output of self.fc2. The output of self.fc2 has size 1, so the output of the softmax will be 1 regardless of the input. Read more about the softmax activation function in the PyTorch package here. I suspect that you wanted to use the Sigmoid function to map the output of the last linear layer to the interval [0, 1] and then apply some kind of log function to it.

Because the softmax produces an output of 1 no matter what the input is, the model does not train well. I do not have access to your data, so I cannot reproduce it exactly, but from the information I have, replacing the softmax activation with a sigmoid should fix the problem.
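To see this concretely, here is a minimal standalone sketch (the logit values are made up) showing that softmax along dim=1 of a single-column tensor is always 1, and log_softmax is therefore always 0:

import torch
import torch.nn.functional as F

# Arbitrary (N, 1) scores standing in for the output of self.fc2
logits = torch.tensor([[-3.2], [0.0], [7.5]])

print(F.softmax(logits, dim=1))      # tensor([[1.], [1.], [1.]])
print(F.log_softmax(logits, dim=1))  # tensor([[0.], [0.], [0.]])

Since the model output never changes, the loss is constant and the optimizer gets no useful gradient.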
A better and more numerically stable approach is to use nn.BCEWithLogitsLoss instead of criterion = nn.BCELoss() and to drop the activation function at the end, since that criterion combines the sigmoid with the BCE loss in a single, more numerically stable computation.
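As a quick illustration (with arbitrary example values, not your data), the combined criterion gives the same result as applying a sigmoid followed by nn.BCELoss, just computed more stably:

import torch
from torch import nn

logits = torch.tensor([[2.0], [-1.5], [0.3]])   # raw outputs of a final linear layer
targets = torch.tensor([[1.0], [0.0], [1.0]])   # binary labels as floats

loss_combined = nn.BCEWithLogitsLoss()(logits, targets)
loss_separate = nn.BCELoss()(torch.sigmoid(logits), targets)

print(loss_combined.item(), loss_separate.item())  # the two values match up to floating-point error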
To summarize, my suggestion is to change criterion = nn.BCELoss() to criterion = nn.BCEWithLogitsLoss() and to change the forward function as follows:
def forward(self, x):
    # make sure input tensor is flattened
    x = x.view(x.shape[0], -1)
    x = F.relu(self.fc1(x))
    x = self.fc2(x)
    return x
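With those two changes the rest of your loop stays essentially the same; the only other adjustment is that probabilities now come from torch.sigmoid on the raw outputs rather than torch.exp on log-probabilities. A rough sketch, reusing the variable names from your code:

criterion = nn.BCEWithLogitsLoss()

for e in range(epochs):
    optimizer.zero_grad()
    logits = model(X_train_t)                    # raw scores, no activation in forward()
    loss = criterion(logits, y_train_t.float())  # sigmoid + BCE handled inside the criterion
    loss.backward()
    optimizer.step()

    with torch.no_grad():
        test_logits = model(X_test_t)
        test_loss = criterion(test_logits, y_test_t.float())
        ps = torch.sigmoid(test_logits)          # probabilities in [0, 1], if you need them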