RNN is not training (PyTorch)
I can't figure out what I'm doing wrong when training an RNN. I'm trying to train an RNN to perform an AND operation over a sequence (to see how it handles a simple task).
But my network is not learning: the loss stays the same, and it can't even overfit the data.
Can you help me find the problem?
The data I'm using:
data = [
    [1, 1, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1],
    [0],
    [1],
    [1, 0]]
labels = [
    0,
    1,
    0,
    0,
    1,
    1,
    0,
    1,
    0
]
The NN code:
class AndRNN(nn.Module):
    def __init__(self):
        super(AndRNN, self).__init__()
        # input_size=1, hidden_size=10, num_layers=5
        self.rnn = nn.RNN(1, 10, 5)
        self.fc = nn.Sequential(
            nn.Linear(10, 30),
            nn.Linear(30, 2)
        )

    def forward(self, input, hidden):
        x, hidden = self.rnn(input, hidden)
        # classify using the output of the last time step
        x = self.fc(x[-1])
        return x, hidden

    def initHidden(self):
        # (num_layers, batch, hidden_size)
        return Variable(torch.zeros((5, 1, 10)))
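A quick forward pass to sanity-check the shapes (illustrative only; net, x, and h are throwaway names, not from the post):

net = AndRNN()
h = net.initHidden()
x = Variable(torch.FloatTensor([1, 0, 1]).view(-1, 1, 1))
out, h = net(x, h)
print(out.size())  # torch.Size([1, 2]) -> logits for the two classes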
Training loop:
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

correct = 0
for e in range(20):
    for i in range(len(data)):
        tensor = torch.FloatTensor(data[i]).view(-1, 1, 1)
        label = torch.LongTensor([labels[i]])
        hidden = net.initHidden()
        optimizer.zero_grad()
        out, hidden = net(Variable(tensor), Variable(hidden.data))
        _, l = torch.topk(out, 1)
        if label[0] == l[0].data[0]:
            correct += 1
        loss = criterion(out, Variable(label))
        loss.backward()
        optimizer.step()
    print("Loss ", loss.data[0], "Accuracy ", (correct / (i + 1)))
The tensor's shape is (sequence_len, 1 (batch size), 1), which is correct according to the PyTorch documentation for RNN.
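You can confirm that shape directly (a quick check, not in the original post):

seq = torch.FloatTensor([1, 1, 0]).view(-1, 1, 1)
print(seq.size())  # torch.Size([3, 1, 1]) -> (sequence_len, batch, input_size)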
The problem is in this line:
out, hidden = net(Variable(tensor), Variable(hidden.data))
It should simply be:
out, hidden = net(Variable(tensor), hidden)
By writing Variable(hidden.data) there, you create a brand-new hidden-state Variable (all zeros) at every step, instead of passing in the hidden state from the previous step.
I tried your example and changed the optimizer to Adam. Here is the complete code:
import torch
import torch.nn as nn
from torch.autograd import Variable

class AndRNN(nn.Module):
    def __init__(self):
        super(AndRNN, self).__init__()
        self.rnn = nn.RNN(1, 10, 5)
        self.fc = nn.Sequential(
            nn.Linear(10, 30),
            nn.Linear(30, 2)
        )

    def forward(self, input, hidden):
        x, hidden = self.rnn(input, hidden)
        x = self.fc(x[-1])
        return x, hidden

    def initHidden(self):
        return Variable(torch.zeros((5, 1, 10)))

net = AndRNN()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters())

correct = 0
for e in range(100):
    for i in range(len(data)):
        tensor = torch.FloatTensor(data[i]).view(-1, 1, 1)
        label = torch.LongTensor([labels[i]])
        hidden = net.initHidden()
        optimizer.zero_grad()
        out, hidden = net(Variable(tensor), hidden)  # pass hidden directly, not Variable(hidden.data)
        loss = criterion(out, Variable(label))
        loss.backward()
        optimizer.step()
    if e % 25 == 0:
        print("Loss ", loss.data[0])
Results:
Loss 0.6370733976364136
Loss 0.25336754322052
Loss 0.006924811284989119
Loss 0.002351854695007205
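For readers on current PyTorch: Variable and loss.data[0] are pre-0.4 idioms. Below is a minimal sketch of the same fixed training loop in modern PyTorch, assuming data and labels as defined above; tensors carry autograd directly, and loss.item() replaces loss.data[0]:

import torch
import torch.nn as nn

class AndRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.RNN(1, 10, 5)        # input_size=1, hidden_size=10, num_layers=5
        self.fc = nn.Sequential(nn.Linear(10, 30), nn.Linear(30, 2))

    def forward(self, input, hidden):
        x, hidden = self.rnn(input, hidden)
        return self.fc(x[-1]), hidden      # classify on the last time step

    def initHidden(self):
        return torch.zeros(5, 1, 10)       # (num_layers, batch, hidden_size)

net = AndRNN()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters())

for e in range(100):
    for seq, y in zip(data, labels):
        tensor = torch.tensor(seq, dtype=torch.float32).view(-1, 1, 1)
        label = torch.tensor([y])
        hidden = net.initHidden()
        optimizer.zero_grad()
        out, hidden = net(tensor, hidden)  # pass hidden as-is, no re-wrapping
        loss = criterion(out, label)
        loss.backward()
        optimizer.step()
    if e % 25 == 0:
        print("Loss", loss.item())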