Incompatible weight shape between Caffe and TensorFlow/Keras
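I am trying to convert a Caffe model to Keras, and I have successfully used both MMdnn and even caffe-tensorflow. The output I get is a set of .npy files and .pb files. I was not happy with the .pb files, so I stuck with the .npy files, which contain the weights and biases. I have reconstructed an mAlexNet network as follows: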
import tensorflow as tf
from tensorflow import keras
# Import layers from tensorflow.keras so they match the keras module used below
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Dense, Flatten

def define_malexnet():
    inputs = keras.Input(shape=(224, 224, 3), name='data')
    x = Conv2D(16, kernel_size=(11,11), strides=(4,4), activation='relu', name='conv1')(inputs)
    x = MaxPool2D(pool_size=(3,3), strides=(2,2), padding='same', name='pool1')(x)
    x = Conv2D(20, kernel_size=(5,5), strides=(1,1), activation='relu', name='conv2')(x)
    x = MaxPool2D(pool_size=(3,3), strides=(2,2), name='pool2')(x)
    x = Conv2D(30, kernel_size=(3,3), strides=(1,1), activation='relu', name='conv3')(x)
    x = MaxPool2D(pool_size=(3,3), strides=(2,2), name='pool3')(x)
    x = Flatten()(x)
    x = Dense(48, activation='relu', name='fc4')(x)
    output = Dense(2, activation='softmax', name='fc5')(x)
    occupancy_model = keras.Model(inputs, output, name='occupancy_malexnet')
    occupancy_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return occupancy_model
Then I try to load the weights using this code snippet:
import numpy as np

weights_data = np.load('weights.npy', allow_pickle=True).item()
model = define_malexnet()
for layer in model.layers:
    if layer.name in weights_data.keys():
        layer_weights = weights_data[layer.name]
        layer.set_weights((layer_weights['weights'], layer_weights['bias']))
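During this process I get an error:
ValueError: Layer conv1 weight shape (16,) is not compatible with provided weight shape (1, 1, 1, 16).
As I understand it, this is because of the different backends and how they initialize weights, but I have not found a way to solve the problem. My question is: how do I adapt the weights loaded from the file so that they fit my Keras model? Link to the weights.npy file: https://drive.google.com/file/d/1QKzY-WxiUnf9VnlhWQS38DE3uF5I_qTl/view?usp=sharing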
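The problem is the bias vector. It is stored as a 4D tensor of shape (1, 1, 1, 16), but Keras expects a 1D tensor of shape (16,). Simply flatten the bias vector: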
import numpy as np

weights_data = np.load('weights.npy', allow_pickle=True).item()
model = define_malexnet()
for layer in model.layers:
    if layer.name in weights_data.keys():
        layer_weights = weights_data[layer.name]
        # Flatten the (1, 1, 1, N) bias to the (N,) shape Keras expects
        layer.set_weights((layer_weights['weights'], layer_weights['bias'].flatten()))
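If other arrays in the file also carry extra singleton dimensions, a slightly more defensive variant is to reshape each array to the shape the layer expects and fail loudly when the shapes differ in anything other than size-1 axes. The following is only a minimal sketch using NumPy: the helper name `fit_to_shape` and the stand-in arrays are hypothetical, and in the real loop the expected shapes would have to come from the layer itself, for example from the arrays returned by `layer.get_weights()`.

```python
import numpy as np

def fit_to_shape(array, expected_shape):
    """Reshape `array` to `expected_shape` when they differ only by size-1 axes."""
    array = np.asarray(array)
    expected_shape = tuple(expected_shape)
    if array.shape == expected_shape:
        return array
    # Compare the two shapes with every singleton axis dropped
    squeezed = tuple(d for d in array.shape if d != 1)
    target = tuple(d for d in expected_shape if d != 1)
    if squeezed != target:
        raise ValueError(f"cannot fit shape {array.shape} into {expected_shape}")
    return array.reshape(expected_shape)

# Hypothetical stand-ins for the conv1 arrays stored in weights.npy
kernel = np.zeros((11, 11, 3, 16), dtype=np.float32)
bias = np.arange(16, dtype=np.float32).reshape(1, 1, 1, 16)

fixed_kernel = fit_to_shape(kernel, (11, 11, 3, 16))
fixed_bias = fit_to_shape(bias, (16,))
print(fixed_kernel.shape, fixed_bias.shape)  # (11, 11, 3, 16) (16,)
```

Unlike a bare `flatten()`, this leaves the kernel untouched and raises an error instead of silently accepting an array with the wrong number of channels.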
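As a sanity check, after creating your model and loading the weights this way, I accessed the conv1 weights in the model and the corresponding weights you cached, then compared them: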
In [22]: weights1 = model.layers[1].weights[0].numpy()
In [23]: weights2 = weights_data['conv1']['weights']
In [24]: np.allclose(weights1, weights2)
Out[24]: True
The same goes for the bias:
In [25]: bias1 = model.layers[1].weights[1].numpy()
In [26]: bias2 = weights_data['conv1']['bias']
In [27]: np.allclose(bias1, bias2)
Out[27]: True
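Note that I did not have to flatten the cached bias for this check, because np.allclose broadcasts its arguments: the (16,) array from the model is compared elementwise against the cached (1, 1, 1, 16) array.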
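The bias check passes even though the two arrays have different shapes, which is worth seeing in isolation. Here is a small sketch with made-up values (the contents are hypothetical; only the shapes mirror the conv1 bias). It suggests comparing shapes explicitly if the shapes themselves matter.

```python
import numpy as np

# Made-up bias values; shapes mirror the Keras bias and the cached bias
bias_keras = np.linspace(0.0, 1.5, 16, dtype=np.float32)  # shape (16,)
bias_cached = bias_keras.reshape(1, 1, 1, 16)              # shape (1, 1, 1, 16)

values_match = np.allclose(bias_keras, bias_cached)
shapes_match = bias_keras.shape == bias_cached.shape
broadcast_shape = np.broadcast(bias_keras, bias_cached).shape

print(values_match)     # True: the values agree after broadcasting
print(shapes_match)     # False: the shapes are not the same
print(broadcast_shape)  # (1, 1, 1, 16)
```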