Why do filters and feature layers have the same number of channels?

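Some object detection frameworks, such as SSD (Single Shot MultiBox Detector) and Faster R-CNN, use "convolutional filters" for classification and regression. The following is from the SSD paper: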

For a feature layer of size m × n with p channels, the basic element for predicting parameters of a potential detection is a 3 × 3 × p small kernel that produces either a score for a category, or a shape offset relative to the default box coordinates. At each of the m × n locations where the kernel is applied, it produces an output value.

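My question is: does the number of "small kernels" have to be p? What about setting it to an arbitrary number k (different from the number of feature channels)?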

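In the figure, the Extra Feature Layers part shows how the small kernels extract a p-dimensional vector at each output location, predicting the different aspect ratios and class categories.

For example, for the first convolutional feature map p is (3x(classes+4)), while for the second it is (6x(classes+4)). The numbers 3 and 6 are the number of anchor boxes defined for those feature maps, and each of those anchor boxes has classes + 4 outputs: one score per class plus 4 box coordinates.

So p is fixed by the number of anchor boxes you choose for each feature map and the number of classes you want to detect.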
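As a quick check of that arithmetic, here is a minimal sketch. The helper name `head_out_channels` and the value of 21 classes (20 object classes plus background, Pascal VOC style) are assumptions made for illustration, not something taken from the SSD code.

```python
def head_out_channels(num_anchors: int, num_classes: int) -> int:
    """Output channels of a prediction layer:
    for each anchor box, one score per class plus 4 box offsets."""
    return num_anchors * (num_classes + 4)


# Assumption for illustration: 21 classes (20 objects + background).
num_classes = 21

for num_anchors in (3, 6):
    p = head_out_channels(num_anchors, num_classes)
    print(f"{num_anchors} anchors -> p = {p} output channels")
```

Changing either the number of anchors or the number of classes changes p, which is why it cannot be chosen freely once those two are decided.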

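The feature channels are the result of the 3x3xp convolution, so the output always has p channels, which is the kernel's output channel size. Note that 3x3xp is really 3 x 3 x in_channels x p. For example, the first feature layer is obtained by convolving VGG's 38x38x512 output with a 3x3x512xp kernel, which gives 38x38xp.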
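The shape bookkeeping above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper `conv_shapes` is hypothetical, stride 1 and padding 1 are assumed because they keep the 38x38 spatial size, and p = 75 assumes 3 anchors and 21 classes.

```python
def conv_shapes(in_h, in_w, in_channels, out_channels,
                kernel=3, stride=1, padding=1):
    """Return (weight_shape, output_shape) for a 2D convolution.

    The weight depth must equal in_channels; the number of filters
    (out_channels) sets the number of channels in the output.
    """
    weight_shape = (kernel, kernel, in_channels, out_channels)
    out_h = (in_h + 2 * padding - kernel) // stride + 1
    out_w = (in_w + 2 * padding - kernel) // stride + 1
    return weight_shape, (out_h, out_w, out_channels)


# Assumption for illustration: p = 3 * (21 + 4) = 75.
p = 3 * (21 + 4)
weight_shape, output_shape = conv_shapes(38, 38, 512, p)
print(weight_shape)   # (3, 3, 512, 75)
print(output_shape)   # (38, 38, 75)
```

In this sketch only the third weight dimension is tied to the input feature map (512). The fourth dimension, the number of filters, is the free choice, and it is what determines the channel count of the layer that comes out.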

object-detection

detection

computer-vision

deep-learning

conv-neural-network