Quantize a Keras neural network model
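Recently I started building neural networks with TensorFlow + Keras, and I would like to try the quantization feature available in TensorFlow. So far, experimenting with the examples from the TF tutorials has worked just fine, and I have this basic working example (from https://www.tensorflow.org/tutorials/keras/basic_classification):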
import tensorflow as tf
from tensorflow import keras
fashion_mnist = keras.datasets.fashion_mnist
(train_images, train_labels), (test_images, test_labels) = fashion_mnist.load_data()
# fashion mnist data labels (indexes related to their respective labelling in the data set)
class_names = ['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
# preprocess the train and test images
train_images = train_images / 255.0
test_images = test_images / 255.0
# settings variables
input_shape = (train_images.shape[1], train_images.shape[2])
# create the model layers
model = keras.Sequential([
    keras.layers.Flatten(input_shape=input_shape),
    keras.layers.Dense(128, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)
])
# compile the model with added settings
model.compile(optimizer=tf.train.AdamOptimizer(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# train the model
epochs = 3
model.fit(train_images, train_labels, epochs=epochs)
# evaluate the accuracy of model on test data
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)
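Now I would like to employ quantization in both the learning and the classification process. The quantization documentation (https://www.tensorflow.org/performance/quantization) (the page has been unavailable since circa September 15, 2018) suggests using this piece of code: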
loss = tf.losses.get_total_loss()
tf.contrib.quantize.create_training_graph(quant_delay=2000000)
optimizer = tf.train.GradientDescentOptimizer(0.00001)
optimizer.minimize(loss)
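However, it does not contain any information about where this code should be used or how it should be connected to TF code (it does not even mention a high-level model created with Keras). I have no idea how this quantization part relates to the previously created neural network model. Just inserting it after the neural network code runs into the following error: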
Traceback (most recent call last):
  File "so.py", line 41, in <module>
    loss = tf.losses.get_total_loss()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/losses/util.py", line 112, in get_total_loss
    return math_ops.add_n(losses, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 2119, in add_n
    raise ValueError("inputs must be a list of at least one Tensor with the "
ValueError: inputs must be a list of at least one Tensor with the same dtype and shape
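Is it possible to quantize a Keras NN model in this way, or am I missing something basic?
A possible solution that crossed my mind is to use the low-level TF API instead of Keras (which requires quite a bit of work to construct the model), or maybe to try to extract some of the lower-level methods from the Keras models.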
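Since your network seems quite simple, you can use TensorFlow Lite.
TensorFlow Lite can be used to quantize a Keras model.
The following code was written for TensorFlow 1.14. It might not work with earlier versions.
First, after training the model, you should save it to an h5 file: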
model.fit(train_images, train_labels, epochs=epochs)
# evaluate the accuracy of model on test data
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)
model.save("model.h5")
To load the Keras model, use tf.lite.TFLiteConverter.from_keras_model_file:
# load the previously saved model
converter = tf.lite.TFLiteConverter.from_keras_model_file("model.h5")
tflite_model = converter.convert()
# Save the model to file
with open("tflite_model.tflite", "wb") as output_file:
    output_file.write(tflite_model)
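As far as I can tell, the conversion above keeps the weights in float32 unless an optimization is requested on the converter. A minimal sketch of enabling post-training quantization, assuming TensorFlow 1.14 where `tf.lite.Optimize` is available; the helper name `convert_h5_to_quantized_tflite` and the output file name in the usage note are illustrative and not part of the answer:

```python
def convert_h5_to_quantized_tflite(h5_path, tflite_path):
    """Convert a saved Keras .h5 model to a TFLite file with post-training quantization."""
    # Imported lazily so the helper can be defined even where TensorFlow is absent.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model_file(h5_path)
    # Request the converter's default optimizations, which quantize the weights.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()

    with open(tflite_path, "wb") as output_file:
        output_file.write(tflite_model)

    # Size of the serialized model in bytes, handy for comparing with the float version.
    return len(tflite_model)
```

Calling `convert_h5_to_quantized_tflite("model.h5", "tflite_model_quant.tflite")` after `model.save("model.h5")` should produce a noticeably smaller file than the unoptimized conversion; comparing the two file sizes is a quick sanity check that quantization was actually applied.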
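The saved model can be loaded in a Python script or on other platforms and languages. To use the saved tflite model, tf.lite provides Interpreter. The following example, from the linked source, shows how to load a tflite model from a local file in a Python script.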
import numpy as np
import tensorflow as tf
# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="tflite_model.tflite")
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Test model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
As mentioned in other answers, TensorFlow Lite can help you with network quantization.
TensorFlow Lite provides several levels of support for quantization.
Tensorflow Lite post-training quantization quantizes weights and
activations post training easily. Quantization-aware training allows
for training of networks that can be quantized with minimal accuracy
drop; this is only available for a subset of convolutional neural
network architectures.
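So first of all, you need to decide whether you need post-training quantization or quantization-aware training. For example, if you have already saved the model as a *.h5 file, you will probably want to follow @Mitiku's instructions and do post-training quantization.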
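To make concrete what post-training quantization does to the weights, here is a minimal pure-Python sketch of 8-bit affine quantization. It illustrates the general idea only and is not TensorFlow Lite's actual implementation; the function names and the sample weights are made up for the example.

```python
def quantize_uint8(values):
    """Map floats to uint8 codes using an affine (scale, zero_point) scheme."""
    # The range is widened to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = int(round(-lo / scale))
    codes = [min(255, max(0, int(round(v / scale)) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Recover approximate float values from the uint8 codes."""
    return [scale * (c - zero_point) for c in codes]


weights = [-0.51, -0.02, 0.0, 0.13, 0.76]
codes, scale, zero_point = quantize_uint8(weights)
restored = dequantize(codes, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, zero_point, max_error)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of about half a quantization step. Post-training quantization applies this kind of mapping after training, so the network never gets a chance to adapt to the error.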
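If you prefer to achieve better performance by simulating the effect of quantization during training (using the method you quoted in the question), and your model is in the subset of CNN architectures supported by quantization-aware training, the linked example may help you with the interaction between Keras and TensorFlow. Basically, you only need to add this code between the model definition and its fitting: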
sess = tf.keras.backend.get_session()
tf.contrib.quantize.create_training_graph(sess.graph)
sess.run(tf.global_variables_initializer())
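Roughly speaking, `create_training_graph` rewrites the graph so that weights and activations pass through fake-quantization ops during training. The pure-Python sketch below shows the core of that idea under simplifying assumptions: a fixed range of [-1, 1], no range nudging or learned min/max as the real ops have, and a hypothetical helper name `fake_quantize`.

```python
def fake_quantize(x, x_min, x_max, num_bits=8):
    """Snap x to the nearest of 2**num_bits levels in [x_min, x_max]; the result stays a float."""
    levels = 2 ** num_bits - 1
    scale = (x_max - x_min) / levels
    clipped = min(max(x, x_min), x_max)
    return x_min + round((clipped - x_min) / scale) * scale


# A single neuron's forward pass with float weights versus fake-quantized weights.
inputs = [1.0, 2.0, 3.0]
weights = [0.31, -0.77, 0.05]
fq_weights = [fake_quantize(w, -1.0, 1.0) for w in weights]

float_out = sum(w * x for w, x in zip(weights, inputs))
fq_out = sum(w * x for w, x in zip(fq_weights, inputs))
print(float_out, fq_out)
```

Because the forward pass already sees the rounded weights, the loss reflects the quantization error and training can compensate for it, which is why quantization-aware training tends to lose less accuracy than quantizing after the fact.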