Keras with Tensorflow backend and Theano backend makes different predictions with the same model and the same input
I am trying to make predictions from a pretrained model that was saved as a json string plus h5 weights, but the two backends (Tensorflow and Theano) give me different outputs even though the input and the model are exactly the same. I found that the activations already differ at the first layer, a 1D convolution. Here is the code that prints part of the activations of the 5th filter of the convolution1D layer:
Theano version:
import os
os.environ['KERAS_BACKEND'] = 'theano'  # must be set before keras is imported to take effect
import numpy as np
from keras.models import model_from_json
model_file = 'model.h5'
x_file = 'x.csv'
model_json = '{"class_name": "Model", "keras_version": "1.2.2", "config": {"layers": [{"class_name": "InputLayer", "config": {"batch_input_shape": [null, 1002, 6], "input_dtype": "float32", "sparse": false, "name": "input_1"}, "inbound_nodes": [], "name": "input_1"}, {"class_name": "Convolution1D", "config": {"batch_input_shape": [null, null, 6], "W_constraint": null, "b_constraint": null, "name": "convolution1d_1", "activity_regularizer": null, "trainable": true, "filter_length": 34, "init": "glorot_uniform", "bias": true, "nb_filter": 128, "input_dtype": "float32", "subsample_length": 1, "border_mode": "valid", "input_dim": 6, "b_regularizer": null, "W_regularizer": null, "activation": "relu", "input_length": null}, "inbound_nodes": [[["input_1", 0, 0]]], "name": "convolution1d_1"}], "input_layers": [["input_1", 0, 0]], "output_layers": [["convolution1d_1", 0, 0]], "name": "model_1"}}'
model = model_from_json(model_json)
model.load_weights(model_file)
x = np.loadtxt(x_file)
x = np.reshape(x, (1, x.shape[0], x.shape[1]))  # add a batch dimension: (1, 1002, 6)
y = model.predict(x)
print(y[0, range(230), 4])  # part of the activations of the 5th filter
The input and output look like this:
[Theano version output]
Tensorflow version:
import os
os.environ['KERAS_BACKEND'] = 'tensorflow'  # must be set before keras is imported to take effect
import numpy as np
from keras.models import model_from_json
model_file = 'model.h5'
x_file = 'x.csv'
model_json = '{"class_name": "Model", "keras_version": "1.2.2", "config": {"layers": [{"class_name": "InputLayer", "config": {"batch_input_shape": [null, 1002, 6], "input_dtype": "float32", "sparse": false, "name": "input_1"}, "inbound_nodes": [], "name": "input_1"}, {"class_name": "Convolution1D", "config": {"batch_input_shape": [null, null, 6], "W_constraint": null, "b_constraint": null, "name": "convolution1d_1", "activity_regularizer": null, "trainable": true, "filter_length": 34, "init": "glorot_uniform", "bias": true, "nb_filter": 128, "input_dtype": "float32", "subsample_length": 1, "border_mode": "valid", "input_dim": 6, "b_regularizer": null, "W_regularizer": null, "activation": "relu", "input_length": null}, "inbound_nodes": [[["input_1", 0, 0]]], "name": "convolution1d_1"}], "input_layers": [["input_1", 0, 0]], "output_layers": [["convolution1d_1", 0, 0]], "name": "model_1"}}'
model = model_from_json(model_json)
model.load_weights(model_file)
x = np.loadtxt(x_file)
x = np.reshape(x, (1, x.shape[0], x.shape[1]))  # add a batch dimension: (1, 1002, 6)
y = model.predict(x)
print(y[0, range(230), 4])  # part of the activations of the 5th filter
The input and output look like this:
[Tensorflow version output]
After a few experiments I found that Theano tends to give the "wrong" answer. Here is an example that computes the first window of the 5th filter by hand (the bias is zero in this model; I have checked that):
l = model.get_layer(index=1)
w1 = l.get_weights()[0]  # convolution kernel
w2 = l.get_weights()[1]  # bias (all zeros here)
data1 = w1[:, 0, :, 4]        # kernel of the 5th filter, shape (34, 6)
data2 = x[0, range(34), :]    # first input window, shape (34, 6)
ans = 0
for i in range(6):
    ans += np.sum(np.multiply(data1[:, i], data2[:, i]))
ans comes out to 0.08544020017143339. Tensorflow gives 0.08544022, which matches my calculation, but Theano gives 0.0518605. Can anyone explain this?
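A quick way to test where the discrepancy comes from is to repeat the same dot product with the kernel reversed along the filter-length axis (a minimal sketch reusing w1 and x from above; the values in the comments are what I expect, not independently verified):

data1 = w1[:, 0, :, 4]      # 5th filter kernel, shape (34, 6)
data2 = x[0, :34, :]        # first input window, shape (34, 6)
ans_direct = np.sum(data1 * data2)            # ~0.0854402, matches Tensorflow
ans_flipped = np.sum(data1[::-1, :] * data2)  # should give ~0.0518605 if Theano flips the kernel
print(ans_direct, ans_flipped)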
It looks as if the kernel weights are flipped during the computation in Theano. The following code clearly shows the difference between Tensorflow and Theano:
import numpy as np
import os
os.environ['KERAS_BACKEND'] = 'theano'
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation,Convolution1D
from keras.optimizers import SGD
model = Sequential()
model.add(Convolution1D(1, 2, border_mode='valid', input_shape=(6, 1),bias=True,activation='relu'))
l=model.get_layer(index=1)
w1 = l.get_weights()[0]
w2 = l.get_weights()[1]
np.random.seed(0)
x = np.random.random((1,6,1))
y = model.predict(x)
a_tf = np.sum(np.multiply(w1[:, 0, 0, 0], x[0, range(2), 0]))                   # with the Tensorflow backend, y[0, 0] equals this
a_th = np.sum(np.multiply(np.flip(w1[:, 0, 0, 0], axis=0), x[0, range(2), 0]))  # with the Theano backend, y[0, 0] equals this (kernel flipped)
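If I understand it correctly, this is the classic distinction between cross-correlation (the kernel is applied as stored) and true convolution (the kernel is flipped before being applied); Tensorflow appears to behave like the former here and Theano like the latter. A NumPy-only illustration of that distinction (my own sketch, independent of either backend):

import numpy as np

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([1.0, 0.0, -1.0])

corr = np.correlate(signal, kernel, mode='valid')                 # kernel used as stored
conv = np.convolve(signal, kernel, mode='valid')                  # kernel flipped internally
flipped_corr = np.correlate(signal, kernel[::-1], mode='valid')   # equals conv

print(corr)          # [-2. -2. -2.]
print(conv)          # [ 2.  2.  2.]
print(flipped_corr)  # [ 2.  2.  2.]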