Keras with Tensorflow backend and Theano backend makes different predictions with the same model and the same input
I am trying to make predictions from a pretrained model that was saved as a json string plus h5 weights, but the two backends (Tensorflow and Theano) give me different outputs even though the input and the model are exactly the same. I found that the activations already differ at the first layer, a 1D convolution. Here is the code that prints part of the activations of the 5th filter of the convolution1D layer:
Theano version:
import os
os.environ['KERAS_BACKEND'] = 'theano'  # must be set before keras is imported to take effect
import numpy as np
from keras.models import model_from_json
model_file = 'model.h5'
x_file = 'x.csv'
model_json = '{"class_name": "Model", "keras_version": "1.2.2", "config": {"layers": [{"class_name": "InputLayer", "config": {"batch_input_shape": [null, 1002, 6], "input_dtype": "float32", "sparse": false, "name": "input_1"}, "inbound_nodes": [], "name": "input_1"}, {"class_name": "Convolution1D", "config": {"batch_input_shape": [null, null, 6], "W_constraint": null, "b_constraint": null, "name": "convolution1d_1", "activity_regularizer": null, "trainable": true, "filter_length": 34, "init": "glorot_uniform", "bias": true, "nb_filter": 128, "input_dtype": "float32", "subsample_length": 1, "border_mode": "valid", "input_dim": 6, "b_regularizer": null, "W_regularizer": null, "activation": "relu", "input_length": null}, "inbound_nodes": [[["input_1", 0, 0]]], "name": "convolution1d_1"}], "input_layers": [["input_1", 0, 0]], "output_layers": [["convolution1d_1", 0, 0]], "name": "model_1"}}'
model = model_from_json(model_json)
model.load_weights(model_file)
x = np.loadtxt(x_file)
x = np.reshape(x, (1, x.shape[0], x.shape[1]))  # add a batch dimension: (1, 1002, 6)
y = model.predict(x)
print(y[0, range(230), 4])  # part of the activations of the 5th filter
The input and output look like this:
[Theano version output]
Tensorflow version:
import os
os.environ['KERAS_BACKEND'] = 'tensorflow'  # must be set before keras is imported to take effect
import numpy as np
from keras.models import model_from_json
model_file = 'model.h5'
x_file = 'x.csv'
model_json = '{"class_name": "Model", "keras_version": "1.2.2", "config": {"layers": [{"class_name": "InputLayer", "config": {"batch_input_shape": [null, 1002, 6], "input_dtype": "float32", "sparse": false, "name": "input_1"}, "inbound_nodes": [], "name": "input_1"}, {"class_name": "Convolution1D", "config": {"batch_input_shape": [null, null, 6], "W_constraint": null, "b_constraint": null, "name": "convolution1d_1", "activity_regularizer": null, "trainable": true, "filter_length": 34, "init": "glorot_uniform", "bias": true, "nb_filter": 128, "input_dtype": "float32", "subsample_length": 1, "border_mode": "valid", "input_dim": 6, "b_regularizer": null, "W_regularizer": null, "activation": "relu", "input_length": null}, "inbound_nodes": [[["input_1", 0, 0]]], "name": "convolution1d_1"}], "input_layers": [["input_1", 0, 0]], "output_layers": [["convolution1d_1", 0, 0]], "name": "model_1"}}'
model = model_from_json(model_json)
model.load_weights(model_file)
x = np.loadtxt(x_file)
x = np.reshape(x, (1, x.shape[0], x.shape[1]))  # add a batch dimension: (1, 1002, 6)
y = model.predict(x)
print(y[0, range(230), 4])  # part of the activations of the 5th filter
The input and output look like this:
[Tensorflow version output]
After a few experiments I found that Theano tends to give the "wrong" answer. Here is an example that computes the first window of the 5th filter by hand (the bias is zero in this model; I have checked that):
l = model.get_layer(index=1)
w1 = l.get_weights()[0]  # convolution kernel
w2 = l.get_weights()[1]  # bias (all zeros here)
data1 = w1[:, 0, :, 4]        # kernel of the 5th filter, shape (34, 6)
data2 = x[0, range(34), :]    # first input window, shape (34, 6)
ans = 0
for i in range(6):
    ans += np.sum(np.multiply(data1[:, i], data2[:, i]))
ans comes out to 0.08544020017143339. Tensorflow gives 0.08544022, which matches my calculation, but Theano gives 0.0518605. Can anyone explain this?
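A quick way to test where the discrepancy comes from is to repeat the same dot product with the kernel reversed along the filter-length axis (a minimal sketch reusing w1 and x from above; the values in the comments are what I expect, not independently verified):

data1 = w1[:, 0, :, 4]      # 5th filter kernel, shape (34, 6)
data2 = x[0, :34, :]        # first input window, shape (34, 6)
ans_direct = np.sum(data1 * data2)            # ~0.0854402, matches Tensorflow
ans_flipped = np.sum(data1[::-1, :] * data2)  # should give ~0.0518605 if Theano flips the kernel
print(ans_direct, ans_flipped)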
It looks as if the kernel weights are flipped during the computation in Theano. The following code clearly shows the difference between Tensorflow and Theano:
import numpy as np
import os
os.environ['KERAS_BACKEND'] = 'theano'
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation,Convolution1D
from keras.optimizers import SGD
model = Sequential()
model.add(Convolution1D(1, 2, border_mode='valid', input_shape=(6, 1),bias=True,activation='relu'))
l=model.get_layer(index=1)
w1 = l.get_weights()[0]
w2 = l.get_weights()[1]
np.random.seed(0)
x = np.random.random((1,6,1))
y = model.predict(x)
a_tf = np.sum(np.multiply(w1[:, 0, 0, 0], x[0, range(2), 0]))                   # with the Tensorflow backend, y[0, 0] equals this
a_th = np.sum(np.multiply(np.flip(w1[:, 0, 0, 0], axis=0), x[0, range(2), 0]))  # with the Theano backend, y[0, 0] equals this (kernel flipped)
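If I understand it correctly, this is the classic distinction between cross-correlation (the kernel is applied as stored) and true convolution (the kernel is flipped before being applied); Tensorflow appears to behave like the former here and Theano like the latter. A NumPy-only illustration of that distinction (my own sketch, independent of either backend):

import numpy as np

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([1.0, 0.0, -1.0])

corr = np.correlate(signal, kernel, mode='valid')                 # kernel used as stored
conv = np.convolve(signal, kernel, mode='valid')                  # kernel flipped internally
flipped_corr = np.correlate(signal, kernel[::-1], mode='valid')   # equals conv

print(corr)          # [-2. -2. -2.]
print(conv)          # [ 2.  2.  2.]
print(flipped_corr)  # [ 2.  2.  2.]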