2D Matrix in a Convolutional Network

Question

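This may be a stupid question, but I want to use a convolutional neural network in my deep reinforcement learning project, and I've run into a problem I don't understand. In my project I want to feed a 6x7 matrix into the network, which should be equivalent to a 6x7 black-and-white image (42 pixels), right?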

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Sequential()
        self.model.add_module("conv_1", torch.nn.Conv2d(in_channels=1, out_channels=16, kernel_size=4, stride = 1))
        self.model.add_module("relu_1", torch.nn.ReLU())
        self.model.add_module("max_pool", torch.nn.MaxPool2d(2))
        self.model.add_module("conv_2", torch.nn.Conv2d(in_channels=16, out_channels=16, kernel_size=4, stride = 1))
        self.model.add_module("relu_2", torch.nn.ReLU())
        self.model.add_module("flatten", torch.nn.Flatten())

        self.model.add_module("linear", torch.nn.Linear(in_features=16*16*16, out_features=7))

    def forward(self, x):
        x = self.model(x)
        return x

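In conv_1, in_channels=1 because I have only one matrix (in image recognition this would mean one color channel). The other in_channels and out_channels values before the linear layer are arbitrary. I don't know where I should specify the size of the matrix, but the final output should have size 7, which I set in the linear layer.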

The error I get is:

RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [6, 7]

Answer 1

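There are a few issues with your code. First, you get that error message because the CNN expects a tensor of shape (N, Cin, Hin, Win), or (Cin, Hin, Win) for a single unbatched sample, where:

N is the batch size
Cin is the number of input channels
Hin is the height of the input image in pixels
Win is the width of the input image in pixels

You provided only the height and width dimensions. You need to add the channel and batch dimensions explicitly, even if both are just 1: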

import torch

model = CNN()  # the CNN class from the question

example_input = torch.randn(size=(6, 7))  # this is your input image
print(example_input.shape)  # torch.Size([6, 7])

# output = model(example_input)  # uncommenting this raises your original error

example_input = example_input.unsqueeze(0).unsqueeze(0)  # adds batch and channel dimensions
print(example_input.shape)  # torch.Size([1, 1, 6, 7])

output = model(example_input)  # the original shape error is gone

However, you will notice that you now get a different error:

RuntimeError: Calculated padded input size per channel: (1 x 2). Kernel size: (4 x 4). Kernel size can't be greater than actual input size

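This is because after the first conv layer and the max pooling, your data has shape 1x2, but the kernel size of your second conv layer is 4, which makes the operation impossible. A 6x7 input image is very small: either reduce the kernel size to something feasible or use a larger image.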
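
To see where the 1 x 2 comes from, the spatial size can be traced by hand. For a convolution or pooling layer with no padding and no dilation, each dimension becomes floor((size - kernel) / stride) + 1. A minimal sketch of that arithmetic follows; the helper names out_size and trace are made up for illustration and are not part of PyTorch:

```python
def out_size(size, kernel, stride):
    """Output length of one dimension for a conv/pool layer with no padding or dilation."""
    return (size - kernel) // stride + 1


def trace(height, width, layers):
    """Apply (name, kernel, stride) layers in order and return a list of (name, h, w)."""
    shapes = [("input", height, width)]
    for name, kernel, stride in layers:
        height = out_size(height, kernel, stride)
        width = out_size(width, kernel, stride)
        shapes.append((name, height, width))
    return shapes


# Layers from the question up to the second convolution:
# Conv2d(kernel_size=4, stride=1), then MaxPool2d(2), whose stride defaults to its kernel size.
question_layers = [("conv_1", 4, 1), ("max_pool", 2, 2)]
for name, h, w in trace(6, 7, question_layers):
    print(f"{name}: {h} x {w}")
```

The last line printed is the 1 x 2 feature map that the second 4 x 4 kernel cannot fit into.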

Here is a working example:

import torch
from torch import nn


class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Sequential()
        self.model.add_module(
            "conv_1",
            torch.nn.Conv2d(in_channels=1, out_channels=16, kernel_size=2, stride=1),
        )
        self.model.add_module("relu_1", torch.nn.ReLU())
        self.model.add_module("max_pool", torch.nn.MaxPool2d(2))
        self.model.add_module(
            "conv_2",
            torch.nn.Conv2d(in_channels=16, out_channels=16, kernel_size=2, stride=1),
        )
        self.model.add_module("relu_2", torch.nn.ReLU())
        self.model.add_module("flatten", torch.nn.Flatten())

        self.model.add_module("linear", torch.nn.Linear(in_features=32, out_features=7))

    def forward(self, x):
        x = self.model(x)
        return x


model = CNN()
x = torch.randn(size=(6, 7))
x = x.unsqueeze(0).unsqueeze(0)
output = model(x)
print(output.shape) # has shape (1, 7)

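Note that I changed kernel_size to 2 and that the input size of the final linear layer is now 32 (16 channels x 1 x 2 after the second conv layer). Also, the output has shape (1, 7), where 1 is the batch size, which in our case is just 1. If you want only the 7 output features, simply use output = torch.squeeze(output).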
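
Rather than working out the 32 by hand, one option is to push a dummy input through the convolutional part and read off the flattened size. The sketch below assumes the same layers as the working example; the names features, n_flat and boards are illustrative only. It also shows that a batch of boards stored as (B, 6, 7) only needs the channel dimension added with unsqueeze(1).

```python
import torch
from torch import nn

# Convolutional part of the working example, without the final linear layer.
features = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=16, kernel_size=2, stride=1),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(in_channels=16, out_channels=16, kernel_size=2, stride=1),
    nn.ReLU(),
    nn.Flatten(),
)

# Run one dummy 6x7 board through it to discover the flattened size.
with torch.no_grad():
    n_flat = features(torch.zeros(1, 1, 6, 7)).shape[1]
print(n_flat)  # 16 channels * 1 * 2 = 32

model = nn.Sequential(features, nn.Linear(in_features=n_flat, out_features=7))

# A batch of 5 boards stored as (5, 6, 7) only needs the channel dimension.
boards = torch.randn(5, 6, 7)
batch_out = model(boards.unsqueeze(1))  # input shape (5, 1, 6, 7)
print(batch_out.shape)  # torch.Size([5, 7])

# A single board needs both dimensions; squeeze(0) drops the batch dimension again.
single_out = model(torch.randn(6, 7).unsqueeze(0).unsqueeze(0)).squeeze(0)
print(single_out.shape)  # torch.Size([7])
```

Using squeeze(0) instead of a bare squeeze() removes only the batch dimension, so it cannot accidentally drop another dimension that happens to have size 1.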
