Transforming MPSNNImageNode using Metal Performance Shader

Question

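I'm currently replicating YOLOv2 (not tiny) on iOS (Swift 4) using MPS.

One problem is that I'm having a hard time implementing the space_to_depth function (https://www.tensorflow.org/api_docs/python/tf/space_to_depth) and concatenating two convolution results (13x13x256 + 13x13x1024 -> 13x13x1280). Could you give me some advice on building these parts? My code is below.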

...

let conv19 = MPSCNNConvolutionNode(source: conv18.resultImage,
                                   weights: DataSource("conv19", 3, 3, 1024, 1024))

let conv20 = MPSCNNConvolutionNode(source: conv19.resultImage,
                                   weights: DataSource("conv20", 3, 3, 1024, 1024))

let conv21 = MPSCNNConvolutionNode(source: conv13.resultImage,
                                   weights: DataSource("conv21", 1, 1, 512, 64))

/*****
    1. space_to_depth with conv21
    2. concatenate the result of conv20 (13x13x1024) to the result of 1 (13x13x256)
    I need your help to implement this part!
******/

Answer 1

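I believe space_to_depth can be expressed as a convolution. For example, for an input of dimension [1,2,2,1], use 4 convolution kernels, each of which outputs one number to one channel, i.e. [[1,0],[0,0]] [[0,1],[0,0]] [[0,0],[1,0]] [[0,0],[0,1]]. This should move all the input numbers from the spatial dimensions to the depth dimension.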
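
To make the kernel idea concrete, here is a minimal sketch in plain Swift (no Metal involved) that builds such one-hot weights for an arbitrary block size and channel count, and checks them with a small CPU convolution. It assumes a stride equal to the block size, the channel ordering described for tf.space_to_depth, and the [outputChannel][kernelY][kernelX][inputChannel] weight layout used by MPS convolutions. The function names are hypothetical, and feeding the weights into a convolution node through your own data source is not shown.

```swift
// Builds one-hot weights for a convolution that behaves like space_to_depth.
// Layout: [outputChannel][kernelY][kernelX][inputChannel].
func spaceToDepthWeights(blockSize: Int, inputChannels: Int) -> [Float] {
    let outputChannels = blockSize * blockSize * inputChannels
    var weights = [Float](repeating: 0,
                          count: outputChannels * blockSize * blockSize * inputChannels)
    for dy in 0..<blockSize {
        for dx in 0..<blockSize {
            for c in 0..<inputChannels {
                // Assumed output channel order: (dy, dx, c), as in tf.space_to_depth.
                let o = (dy * blockSize + dx) * inputChannels + c
                let index = ((o * blockSize + dy) * blockSize + dx) * inputChannels + c
                weights[index] = 1
            }
        }
    }
    return weights
}

// Reference CPU convolution with stride == kernel size == blockSize.
// Input and output are stored in height-width-channel order. For checking only.
func stridedConvolution(input: [Float], height: Int, width: Int, channels: Int,
                        weights: [Float], blockSize: Int) -> [Float] {
    let outH = height / blockSize
    let outW = width / blockSize
    let outC = blockSize * blockSize * channels
    var output = [Float](repeating: 0, count: outH * outW * outC)
    for y in 0..<outH {
        for x in 0..<outW {
            for o in 0..<outC {
                var sum: Float = 0
                for ky in 0..<blockSize {
                    for kx in 0..<blockSize {
                        for c in 0..<channels {
                            let w = weights[((o * blockSize + ky) * blockSize + kx) * channels + c]
                            let row = y * blockSize + ky
                            let col = x * blockSize + kx
                            sum += w * input[(row * width + col) * channels + c]
                        }
                    }
                }
                output[(y * outW + x) * outC + o] = sum
            }
        }
    }
    return output
}

// The [1,2,2,1] example from the answer: four spatial values become four channels.
let exampleWeights = spaceToDepthWeights(blockSize: 2, inputChannels: 1)
let exampleOutput = stridedConvolution(input: [1, 2, 3, 4], height: 2, width: 2,
                                       channels: 1, weights: exampleWeights, blockSize: 2)
print(exampleOutput) // [1.0, 2.0, 3.0, 4.0]
```

For the network in the question, the same helper would be called with blockSize 2 and inputChannels 64, giving a 2x2 kernel with 256 output channels that would have to be applied with stride 2.
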
MPS actually has a concat node. See here: https://developer.apple.com/documentation/metalperformanceshaders/mpsnnconcatenationnode

You can use it like this: concatNode = [[MPSNNConcatenationNode alloc] initWithSources:@[layerA.resultImage, layerB.resultImage]];
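
Since the question is written in Swift, the Objective-C call above would look roughly like the sketch below. The function and its parameter names are hypothetical. `spaceToDepth` stands for whatever filter node ends up producing the 13x13x256 image, which the question has not implemented yet.

```swift
import MetalPerformanceShaders

// Joins two graph nodes along the feature-channel axis.
// `spaceToDepth` (13x13x256) is a hypothetical node; `conv20` (13x13x1024)
// is the node of that name from the question.
func concatenate(spaceToDepth: MPSNNFilterNode,
                 conv20: MPSNNFilterNode) -> MPSNNConcatenationNode {
    // Channels appear in the order the sources are listed: 256 first, then 1024.
    return MPSNNConcatenationNode(sources: [spaceToDepth.resultImage,
                                            conv20.resultImage])
}
```

The returned node's resultImage (13x13x1280) can then be passed as the source of the next layer, in the same way conv19.resultImage is passed to conv20 in the question.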

Answer 2

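If you are using the high-level interface and MPSNNGraph, you should just use MPSNNConcatenationNode, as described by Tianyu Liu above.

If you are working with the low-level interface and handling the MPSKernels manually yourself, it can be done as follows:

Create a 1280-channel destination image to hold the result
Run the first filter as normal to produce the first 256 channels of the result
Run the second filter to produce the remaining channels, with destinationFeatureChannelOffset set to 256.

That should be enough in all cases, except where the data is not the product of an MPSKernel. In that case, you will need to copy it yourself or use something like a linear neuron (a=1, b=0) to do it.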
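
The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the function name, its parameters, and the .float16 channel format are not from the thread, and the two kernels are assumed to be already configured so that they produce 256 and 1024 feature channels at 13x13.

```swift
import MetalPerformanceShaders

// Encodes two kernels into one shared 1280-channel destination image.
// All parameter names are hypothetical; the kernels are assumed to output
// 13x13x256 and 13x13x1024 respectively.
func encodeConcatenated(commandBuffer: MTLCommandBuffer,
                        device: MTLDevice,
                        firstKernel: MPSCNNKernel, firstSource: MPSImage,
                        secondKernel: MPSCNNKernel, secondSource: MPSImage) -> MPSImage {
    // Step 1: one destination image with 256 + 1024 = 1280 feature channels.
    // The .float16 format is an assumption; use whatever the rest of the network uses.
    let descriptor = MPSImageDescriptor(channelFormat: .float16,
                                        width: 13, height: 13,
                                        featureChannels: 1280)
    let destination = MPSImage(device: device, imageDescriptor: descriptor)

    // Step 2: the first filter writes channels 0..<256 as usual.
    firstKernel.destinationFeatureChannelOffset = 0
    firstKernel.encode(commandBuffer: commandBuffer,
                       sourceImage: firstSource,
                       destinationImage: destination)

    // Step 3: the second filter writes the remaining channels, starting at 256.
    secondKernel.destinationFeatureChannelOffset = 256
    secondKernel.encode(commandBuffer: commandBuffer,
                        sourceImage: secondSource,
                        destinationImage: destination)

    return destination
}
```

Both kernels write into the same image, so no separate copy pass is needed as long as each half of the data comes out of an MPS kernel.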


neural-network

ios

metal-performance-shaders

swift4

yolo