PyTorch forward pass using weights trained by Theano

Question

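I trained a small CNN binary classifier in Theano. To simplify the code, I want to port the trained weights to a PyTorch or NumPy forward pass for prediction. The original Theano program predicts satisfactorily, but the PyTorch forward pass predicts the same class for every example.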

Here is how I save the trained weights in Theano using h5py:

layer0_w = layer0.W.get_value(borrow=True)
layer0_b = layer0.b.get_value(borrow=True)
layer1_w = layer1.W.get_value(borrow=True)
layer1_b = layer1.b.get_value(borrow=True)
layer2_w = layer2.W.get_value(borrow=True)
layer2_b = layer2.b.get_value(borrow=True)
sm_w = layer_softmax.W.get_value(borrow=True)
sm_b = layer_softmax.b.get_value(borrow=True)

import h5py

# Write each array to its own HDF5 file, using the same name for the file
# and the dataset. The context manager closes (and flushes) every file.
weights = {
    'layer0_w': layer0_w, 'layer0_b': layer0_b,
    'layer1_w': layer1_w, 'layer1_b': layer1_b,
    'layer2_w': layer2_w, 'layer2_b': layer2_b,
    'softmax_w': sm_w, 'softmax_b': sm_b,
}
for name, array in weights.items():
    with h5py.File('./model/%s.h5' % name, 'w') as f:
        f.create_dataset(name, data=array)

I then load the weights to build the forward pass with PyTorch and NumPy:

import torch
import numpy as np
import torch.nn.functional as F
def model(data):

    conv0_out = F.conv2d(input=np2var(data),
                         weight=np2var(layer0_w),
                         bias=np2var(layer0_b)
                        )
    layer0_out = relu(var2np(conv0_out))

    conv1_out = F.conv2d(input=np2var(layer0_out),
                         weight=np2var(layer1_w),
                         bias=np2var(layer1_b)
                        )
    layer1_out = np.max(relu(var2np(conv1_out)), axis=2)

    dense_out=relu(np.matmul(layer1_out, layer2_w) + layer2_b)

    softmax_out = softmax(np.matmul(dense_out, softmax_w) + softmax_b)

    return softmax_out

def relu(x):
    return x * (x > 0)
def np2var(x):
    return torch.autograd.Variable(torch.from_numpy(x))
def var2np(x):
    return x.data.numpy()
def softmax(x):
    e_x = np.exp(x - np.max(x))
    return e_x / e_x.sum()

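The input and kernel shapes for the conv2d function are the same in Theano and PyTorch, and the network structure is the same in both frameworks. I stepped through the code and could not find any error. What could be going wrong here?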

Answer 1

Theano uses true convolution (by default, filter_flip=True), while PyTorch uses cross-correlation. So for every convolutional layer, you need to flip the kernel weights along their spatial axes before using them in PyTorch.
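
To make the difference concrete, here is a minimal NumPy-only sketch on a single-channel 2-D input. The helper names cross_correlate2d and convolve2d are made up for this illustration and are not part of either library. It checks that cross-correlating with a spatially flipped kernel gives the same result as a true convolution, while cross-correlating with the unflipped kernel does not.

```python
import numpy as np

def cross_correlate2d(image, kernel):
    """'Valid' 2-D cross-correlation: slide the kernel without flipping it."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def convolve2d(image, kernel):
    """'Valid' 2-D convolution, written out from its definition."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            for m in range(kh):
                for n in range(kw):
                    # The kernel index runs opposite to the image index.
                    out[i, j] += image[i + m, j + n] * kernel[kh - 1 - m, kw - 1 - n]
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((5, 5))
kernel = rng.standard_normal((3, 3))

conv = convolve2d(image, kernel)
xcorr_plain = cross_correlate2d(image, kernel)
xcorr_flipped = cross_correlate2d(image, kernel[::-1, ::-1])

same_without_flip = np.allclose(conv, xcorr_plain)
same_with_flip = np.allclose(conv, xcorr_flipped)
print(same_without_flip, same_with_flip)
```

A network whose kernels were learned under one convention and applied under the other sees every filter mirrored, which is enough to produce degenerate predictions like the ones described in the question.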

You can use the convert_kernel function from Keras to achieve this.
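
If you would rather not depend on Keras, the flip is easy to do by hand. The sketch below assumes the kernels are stored in Theano's (out_channels, in_channels, rows, cols) layout; flip_spatial_axes is a hypothetical helper name and demo_w is a stand-in array, not the real weights. It is also worth checking which axes convert_kernel reverses in your Keras version, since that function assumes a particular kernel layout.

```python
import numpy as np

def flip_spatial_axes(kernel):
    """Reverse the last two (rows, cols) axes of a 4-D convolution kernel.

    Assumes Theano's (out_channels, in_channels, rows, cols) layout.
    """
    if kernel.ndim != 4:
        raise ValueError("expected a 4-D kernel, got %d-D" % kernel.ndim)
    # Reversed slicing yields a view with negative strides, which
    # torch.from_numpy rejects, so materialise a contiguous copy.
    return np.ascontiguousarray(kernel[:, :, ::-1, ::-1])

# Hypothetical stand-in for a loaded weight array such as layer0_w.
demo_w = np.arange(2 * 1 * 2 * 3, dtype=np.float32).reshape(2, 1, 2, 3)
flipped_w = flip_spatial_axes(demo_w)
```

In the question's model this would be used as weight=np2var(flip_spatial_axes(layer0_w)), and likewise for layer1_w. The biases and the dense-layer weights need no change.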

Tags: python, theano, conv-neural-network, pytorch