Cannot figure out dense layer dimensions to run the neural network
I am trying to build a multi-layer neural network. My training data has the following shape:
train[0][0].shape
(4096,)
Below are my dense layers:
from collections import OrderedDict

import torch
from torch import nn

n_out = 8
net = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(4096, 1366)),
    ('hidden_activation', nn.Tanh()),
    ('hidden_linear', nn.Linear(1366, 456)),
    ('hidden_activation', nn.Tanh()),
    ('hidden_linear', nn.Linear(456, 100)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(100, n_out))
]))
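I am using cross entropy as the loss function. The problem shows up when I train the model with the following code:
learning_rate = 0.001
loss_fn = nn.CrossEntropyLoss()  # cross-entropy loss, as stated above
optimizer = torch.optim.SGD(net.parameters(), lr=learning_rate)
n_epochs = 40
for epoch in range(n_epochs):
    for snds, labels in final_train_loader:
        # Flatten each sample to a vector before the first Linear layer
        outputs = net(snds.view(snds.shape[0], -1))
        loss = loss_fn(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    print("Epoch: %d, Loss: %f" % (epoch, float(loss)))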
The error I get is a matrix multiplication error:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (100x4096 and 456x100)
My dimensions are wrong somewhere, but I cannot work out how to correct them.
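The OrderedDict contains three Linear layers associated with the same key, hidden_linear (and the same goes for nn.Tanh with hidden_activation). A dictionary keeps only the last value assigned to a key, so the network actually ends up with just Linear(456, 100), Tanh, and Linear(100, n_out). That is why the error shows the 100x4096 input batch being multiplied against a 456x100 weight. For this to work, you need to give the layers distinct names: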
inp = torch.rand(100, 4096)
net = nn.Sequential(OrderedDict([
    ('hidden_linear0', nn.Linear(4096, 1366)),
    ('hidden_activation0', nn.Tanh()),
    ('hidden_linear1', nn.Linear(1366, 456)),
    ('hidden_activation1', nn.Tanh()),
    ('hidden_linear2', nn.Linear(456, 100)),
    ('hidden_activation2', nn.Tanh()),
    ('output_linear', nn.Linear(100, n_out))
]))
net(inp)  # now it works!
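To see the key collision for yourself, a small check along these lines may help. It is only a sketch: the names `layers`, `collapsed`, and `broken_net` are made up for illustration, and the random `100 x 4096` batch simply mirrors the shapes in the error message. It also shows one possible alternative, passing the modules positionally so that no names are needed at all.

```python
from collections import OrderedDict

import torch
from torch import nn

n_out = 8

# Same layer list as in the question, including the repeated keys.
layers = [
    ('hidden_linear', nn.Linear(4096, 1366)),
    ('hidden_activation', nn.Tanh()),
    ('hidden_linear', nn.Linear(1366, 456)),
    ('hidden_activation', nn.Tanh()),
    ('hidden_linear', nn.Linear(456, 100)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(100, n_out)),
]

# Building the dict silently drops the earlier entries for each repeated key.
collapsed = OrderedDict(layers)
print(list(collapsed.keys()))
# ['hidden_linear', 'hidden_activation', 'output_linear']

broken_net = nn.Sequential(collapsed)
print(len(broken_net))                        # number of modules that survived
print(broken_net.hidden_linear.in_features)   # input size the first layer expects

# Alternative (an assumption about what suits you, not the only fix):
# pass the modules positionally; nn.Sequential then indexes them 0, 1, 2, ...
net = nn.Sequential(*(module for _, module in layers))
out = net(torch.rand(100, 4096))
print(out.shape)  # torch.Size([100, 8])
```

Printing `net` (or `len(net)`) right after building it is a cheap way to catch this kind of mistake before training starts.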