PyTorch Convolution - Why four dimensions?

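I am trying to build a PyTorch network like the one shown in the figure below (see this link for the arXiv paper). The network is meant to learn features of source code. It consists of an embedding lookup layer, followed by a convolution layer, a max-pooling layer, and a dense layer.

This is my attempt at building the network: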

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):
    """
    Class Net.
    This network is used to learn source code features in a
    supervised manner.
    """

    def __init__(self, n_vocab, n=512, k=13, m=10):
        """
        Constructor.

        :param n_vocab:     size of the vocabulary
        :param n:           number of convolution filters
        :param k:           embedding size
        :param m:           kernel size
        """
        super(Net, self).__init__()

        # embedding layer
        self.emb = nn.Embedding(n_vocab, k)

        # convolution and pooling
        self.conv1 = nn.Conv2d(1, n, (m, k))
        self.pool = nn.AdaptiveMaxPool2d(1)

        # fully connected layers
        self.fc1 = nn.Linear(n, 100)
        self.fc2 = nn.Linear(100, 5)

    def forward(self, input):
        """
        Performs a forward pass through the network.

        :param input:       input to network
        :return:            network output
        """
        x = self.emb(torch.LongTensor(input))
        x = x.view(1, 500, 13)
        x = self.pool(F.relu(self.conv1(x)))
        x = F.relu(self.fc1(x))
        x = self.fc2(x)

        return x

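I can't get the convolution to work. I keep getting this error: *** RuntimeError: Expected 4-dimensional input for 4-dimensional weight 512 1 10 13, but got 3-dimensional input of size [1, 500, 13] instead. The input to my network consists of vocabulary indices, which are fed to the embedding layer. An example input looks like this: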

[55, 28, 14, 56, 20, 55, 70, 14, 56, 20, 55, ..., 31, 31, 31, 31, 31, 31, 31]

After feeding this example to the network, I get the corresponding embeddings:

tensor([[[-0.5966, -1.4197,  0.9875,  ..., -0.0211, -2.3168,  0.3744],
        [-0.1759, -1.1841, -0.0564,  ..., -0.0804, -1.1820, -0.1344],
        [ 1.4525,  0.1342, -0.3820,  ..., -0.2679,  0.5997,  1.1058],
        ...,
        [ 1.2074,  0.4087, -0.3353,  ..., -0.1959,  0.5806, -1.4581],
        [ 1.2074,  0.4087, -0.3353,  ..., -0.1959,  0.5806, -1.4581],
        [ 1.2074,  0.4087, -0.3353,  ..., -0.1959,  0.5806, -1.4581]]],
      grad_fn=<ViewBackward>)

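The output looks fine to me. Apparently PyTorch needs four dimensions for the convolution, but I only have three. Which dimension is missing?

My training function looks like this: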

import torch.optim as optim

def train(X, y, n_vocab, epochs=5):
    """
    Trains the network.

    :param X:           network input (indices into vocabulary)
    :param y:           gold labels
    :param n_vocab:     size of the vocabulary
    :param epochs:      number of epochs to train the network
                        (default = 5)
    :return:            trained network
    """
    # instantiate network model
    net = Net(n_vocab)

    # define training loss and optimizer
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

    # train several epochs
    for epoch in range(epochs):

        running_loss = 0.0
        for i in range(len(X)):
            X_b, y_b = X[i], y[i]

            # zero the parameter gradients
            optimizer.zero_grad()

            # perform forward pass
            y_pred = net(X_b)
            # compute loss
            loss = criterion(y_pred, y_b)
            # perform backpropagation
            loss.backward()
            # optimize model parameters
            optimizer.step()

            # print statistics
            running_loss += loss.item()
            if i % 2000 == 0:
                print("[%d, %5d] loss: %.3f" %
                      (epoch + 1, i + 1, running_loss / 2000))
                running_loss = 0.0

    print("Finished training")
    return net

Any help would be appreciated!

Thanks in advance.

The first dimension of your data should be the batch, as mentioned in the documentation:

Applies a 2D convolution over an input signal composed of several input planes.

... the output value of the layer with input size(N, C, H, W) ...

...

N is a batch size, C denotes a number of channels, H is a height of input planes in pixels, and W is width in pixels.

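So you should either batch your data before passing it to the network, or at least reshape it to (1, 1, 500, 13), which corresponds to a batch size of 1 and a single input channel.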
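As a rough illustration of that reshape, here is a minimal sketch of the network from the question with the missing dimension added. It assumes the input arrives as a LongTensor of shape (N, L) rather than a Python list, and the vocabulary size of 100 in the usage lines is made up for the demo. The `torch.flatten` call is not part of the original question or of the suggestion above: it is added because the pooled output has shape (N, n, 1, 1), while `nn.Linear(n, 100)` expects the last dimension to be n.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self, n_vocab, n=512, k=13, m=10):
        super().__init__()
        self.emb = nn.Embedding(n_vocab, k)
        self.conv1 = nn.Conv2d(1, n, (m, k))
        self.pool = nn.AdaptiveMaxPool2d(1)
        self.fc1 = nn.Linear(n, 100)
        self.fc2 = nn.Linear(100, 5)

    def forward(self, indices):
        # indices: LongTensor of vocabulary indices, shape (N, L)
        x = self.emb(indices)        # (N, L, k)
        x = x.unsqueeze(1)           # (N, 1, L, k): add the channel dimension
        x = F.relu(self.conv1(x))    # (N, n, L - m + 1, 1)
        x = self.pool(x)             # (N, n, 1, 1)
        x = torch.flatten(x, 1)      # (N, n), assumed step so fc1 sees n features
        x = F.relu(self.fc1(x))      # (N, 100)
        return self.fc2(x)           # (N, 5)


# Hypothetical usage: one sequence of 500 indices, vocabulary size 100
net = Net(n_vocab=100)
batch = torch.randint(0, 100, (1, 500))
logits = net(batch)
print(logits.shape)
```

Using `unsqueeze(1)` instead of a hard-coded `view(1, 1, 500, 13)` keeps the forward pass independent of the batch size and sequence length, so the same code should work once the data is batched.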