High loss in neural network sequence classification
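I am using a neural network to classify sequences of length 340 into 8 classes, with cross-entropy as the loss. I am getting very high numbers for the loss. I am wondering whether I have made a mistake in calculating the loss for each epoch, or whether I should use a different loss function.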
import torch
import torch.nn as nn
import torch.optim as optim

criterion = nn.CrossEntropyLoss()
if CUDA:
    criterion = criterion.cuda()
optimizer = optim.SGD(model.parameters(), lr=LEARNING_RATE, momentum=0.9)
loss_list = []
for epoch in range(N_EPOCHES):
    tot_loss = 0
    running_loss = 0
    model.train()
    loss_values = []
    acc_list = []
    acc_list = torch.FloatTensor(acc_list)
    sum_acc = 0
    # Training
    for i, (seq_batch, stat_batch) in enumerate(training_generator):
        # Transfer to GPU
        seq_batch, stat_batch = seq_batch.to(device), stat_batch.to(device)
        optimizer.zero_grad()
        # Model computation
        seq_batch = seq_batch.unsqueeze(-1)
        outputs = model(seq_batch)
        # CrossEntropyLoss expects raw logits of shape (batch, 8) and class indices;
        # argmax on the outputs is not differentiable, so pass the logits directly.
        loss = criterion(outputs, stat_batch.argmax(1))
        loss.backward()
        optimizer.step()
        # print statistics
        running_loss += loss.item() * seq_batch.size(0)
        loss_values.append(running_loss / len(training_set))
        if i % 2000 == 1999:  # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 50000),
                  "acc", (outputs.argmax(1) == stat_batch.argmax(1)).float().mean())
            running_loss = 0.0
        sum_acc += (outputs.argmax(1) == stat_batch.argmax(1)).float().sum()
    print("epoch", epoch, "acc", sum_acc / len(training_generator))
print('Finished Training')
[1, 2000] loss: 14.205 acc tensor(0.5312, device='cuda:0')
[1, 4000] loss: 13.377 acc tensor(0.4922, device='cuda:0')
[1, 6000] loss: 13.159 acc tensor(0.5508, device='cuda:0')
[1, 8000] loss: 13.050 acc tensor(0.5547, device='cuda:0')
[1, 10000] loss: 12.974 acc tensor(0.4883, device='cuda:0')
epoch 1 acc tensor(133.6352, device='cuda:0')
[2, 2000] loss: 12.833 acc tensor(0.5781, device='cuda:0')
[2, 4000] loss: 12.834 acc tensor(0.5391, device='cuda:0')
[2, 6000] loss: 12.782 acc tensor(0.5195, device='cuda:0')
[2, 8000] loss: 12.774 acc tensor(0.5508, device='cuda:0')
[2, 10000] loss: 12.762 acc tensor(0.5156, device='cuda:0')
epoch 2 acc tensor(139.2496, device='cuda:0')
[3, 2000] loss: 12.636 acc tensor(0.5469, device='cuda:0')
[3, 4000] loss: 12.640 acc tensor(0.5469, device='cuda:0')
[3, 6000] loss: 12.648 acc tensor(0.5508, device='cuda:0')
[3, 8000] loss: 12.637 acc tensor(0.5586, device='cuda:0')
[3, 10000] loss: 12.620 acc tensor(0.6016, device='cuda:0')
epoch 3 acc tensor(140.6962, device='cuda:0')
[4, 2000] loss: 12.520 acc tensor(0.5547, device='cuda:0')
[4, 4000] loss: 12.541 acc tensor(0.5664, device='cuda:0')
[4, 6000] loss: 12.538 acc tensor(0.5430, device='cuda:0')
[4, 8000] loss: 12.535 acc tensor(0.5547, device='cuda:0')
[4, 10000] loss: 12.548 acc tensor(0.5820, device='cuda:0')
epoch 4 acc tensor(141.6522, device='cuda:0')
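A note on reading these numbers: the printed loss is not a per-sample average. The loop adds loss.item() * seq_batch.size(0) for 2000 batches and then divides by the constant 50000, and the epoch "acc" divides a count of correct predictions by the number of batches. The sketch below rescales both, assuming a batch size of 256. That value is not stated in the post; it is inferred from the printed accuracies (for example 0.5312 = 136/256), and the helper names are hypothetical.

```python
# Assumption: batch size 256, inferred from the printed accuracies (not stated in the post).
BATCH_SIZE = 256
PRINT_EVERY = 2000   # from `if i % 2000 == 1999`
DIVISOR = 50000      # from `running_loss / 50000`


def mean_loss_per_sample(printed_value, batch_size=BATCH_SIZE):
    """Undo the logging scale: printed = mean_loss * PRINT_EVERY * batch_size / DIVISOR."""
    scale = PRINT_EVERY * batch_size / DIVISOR  # 10.24 for batch size 256
    return printed_value / scale


def epoch_accuracy(printed_value, batch_size=BATCH_SIZE):
    """The epoch figure is correct predictions per batch; divide by the batch size."""
    return printed_value / batch_size


for printed in (14.205, 12.974, 12.548):
    print(printed, "->", round(mean_loss_per_sample(printed), 3))
print("epoch 4 accuracy:", round(epoch_accuracy(141.6522), 4))
```

Under that assumption the printed 12.548 corresponds to a mean cross-entropy of roughly 1.23 per sample, and the epoch 4 figure to an accuracy of roughly 0.55, which matches the per-batch accuracies in the log.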
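"I am getting very high numbers for the loss"
What makes you think this is high? What are you comparing it to?
Yes, you should use nn.CrossEntropyLoss for a multi-class classification task. Your training loss looks fine to me. At initialization, you should have loss = -log(1/8) ≈ 2.08.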
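The baseline value can be checked with a few lines of plain Python. This is a minimal sketch, not the poster's code: cross_entropy is a hypothetical helper that applies log-softmax followed by negative log-likelihood to one sample, which is the same quantity nn.CrossEntropyLoss averages over a batch by default. With equal logits for all 8 classes, the result is log(8) regardless of the target class.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample, computed from raw logits (hypothetical helper)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


NUM_CLASSES = 8
uniform_logits = [0.0] * NUM_CLASSES  # an untrained model that favors no class
baseline = cross_entropy(uniform_logits, target=3)
print(f"{baseline:.4f}")  # 2.0794
```

A per-sample loss well above this value early in training would suggest a problem; a value below it means the model is already doing better than a uniform guess.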