Bazel error parsing tf.estimator model
I am trying to make a *.pb model using tf.estimator and export_savedmodel(). It is a simple classifier for the Iris dataset (4 features, 3 classes):
import tensorflow as tf

num_epoch = 500
num_train = 120
num_test = 30

# 1 Define input function
def input_function(x, y, is_train):
    dict_x = {
        "thisisinput": x,
    }

    dataset = tf.data.Dataset.from_tensor_slices((
        dict_x, y
    ))

    if is_train:
        dataset = dataset.shuffle(num_train).repeat(num_epoch).batch(num_train)
    else:
        dataset = dataset.batch(num_test)

    return dataset
def my_serving_input_fn():
    input_data = tf.placeholder(tf.string, [None], name='input_tensors')
    receiver_tensors = {"inputs": input_data}

    # 2 Define feature columns
    feature_columns = [
        tf.feature_column.numeric_column(key="thisisinput", shape=4),
    ]
    features = tf.parse_example(
        input_data,
        tf.feature_column.make_parse_example_spec(feature_columns))

    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)
def main(argv):
    tf.set_random_seed(1103)  # avoiding different result of random

    # xtrain, ytrain, xtest, ytest are assumed to be loaded elsewhere;
    # the data-loading code is omitted from this excerpt

    # 2 Define feature columns
    feature_columns = [
        tf.feature_column.numeric_column(key="thisisinput", shape=4),
    ]

    # 3 Define an estimator
    classifier = tf.estimator.DNNClassifier(
        feature_columns=feature_columns,
        hidden_units=[10],
        n_classes=3,
        optimizer=tf.train.GradientDescentOptimizer(0.001),
        activation_fn=tf.nn.relu,
        model_dir='modeliris2/'
    )

    # Train the model
    classifier.train(
        input_fn=lambda: input_function(xtrain, ytrain, True)
    )

    # Evaluate the model
    eval_result = classifier.evaluate(
        input_fn=lambda: input_function(xtest, ytest, False)
    )
    print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))

    print('\nSaving models...')
    classifier.export_savedmodel("modeliris2pb", my_serving_input_fn)

if __name__ == "__main__":
    tf.logging.set_verbosity(tf.logging.INFO)
    tf.app.run(main)
This generates a saved_model.pb file. I have verified that the model works, and I can load and run it from another program. Now I want to summarize and freeze the model using Bazel. If I build the tool with Bazel and then run the following command:
bazel-bin/tensorflow/tools/graph_transforms/summarize_graph \
--in_graph=saved_model.pb
I get the following error:
[libprotobuf ERROR external/protobuf_archive/src/google/protobuf/text_format.cc:307] Error parsing text-format tensorflow.GraphDef: 1:1: Invalid control characters encountered in text.
[libprotobuf ERROR external/protobuf_archive/src/google/protobuf/text_format.cc:307] Error parsing text-format tensorflow.GraphDef: 1:4: Interpreting non ascii codepoint 218.
[libprotobuf ERROR external/protobuf_archive/src/google/protobuf/text_format.cc:307] Error parsing text-format tensorflow.GraphDef: 1:4: Expected identifier, got: �
2018-08-14 11:50:17.759617: E tensorflow/tools/graph_transforms/summarize_graph_main.cc:320] Loading graph 'saved_model.pb' failed with Can't parse saved_model.pb as binary proto
(both text and binary parsing failed for file saved_model.pb)
2018-08-14 11:50:17.759670: E tensorflow/tools/graph_transforms/summarize_graph_main.cc:322] usage: bazel-bin/tensorflow/tools/graph_transforms/summarize_graph
Flags:
--in_graph="" string input graph file name
--print_structure=false bool whether to print the network connections of the graph
I don't understand this error. I tried the tool on an inception pb file and it worked fine, so I think the problem lies in how tf.estimator builds the .pb file.
Am I missing something when creating a saved model with export_savedmodel() or tf.estimator?
UPDATE
TensorFlow version: v1.9.0-0-g25c197e023 1.9.0
Result of tf_env_collect.sh:
== cat /etc/issue ===============================================
Linux rianadam 4.15.0-32-generic #35-Ubuntu SMP Fri Aug 10 17:58:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
VERSION="18.04.1 LTS (Bionic Beaver)"
VERSION_ID="18.04"
VERSION_CODENAME=bionic
== are we in docker =============================================
No
== compiler =====================================================
c++ (Ubuntu 7.3.0-16ubuntu3) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
== uname -a =====================================================
Linux rianadam 4.15.0-32-generic #35-Ubuntu SMP Fri Aug 10 17:58:07 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
== check pips ===================================================
numpy 1.15.0
protobuf 3.6.0
tensorflow-gpu 1.9.0
== check for virtualenv =========================================
True
== tensorflow import ============================================
tf.VERSION = 1.9.0
tf.GIT_VERSION = v1.9.0-0-g25c197e023
tf.COMPILER_VERSION = v1.9.0-0-g25c197e023
Sanity check: array([1], dtype=int32)
/home/rian/NgodingYuk/tf_env/env/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
/home/rian/NgodingYuk/tf_env/env/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: numpy.dtype size changed, may indicate binary incompatibility. Expected 96, got 88
return f(*args, **kwds)
== env ==========================================================
LD_LIBRARY_PATH /usr/local/cuda/lib64:/usr/local/cuda-9.0/lib64:/usr/local/cuda/lib64:/usr/local/cuda-9.0/lib64:
DYLD_LIBRARY_PATH is unset
== nvidia-smi ===================================================
Tue Aug 21 11:13:55 2018
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.77 Driver Version: 390.77 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce 920M Off | 00000000:04:00.0 N/A | N/A |
| N/A 51C P0 N/A / N/A | 367MiB / 2004MiB | N/A Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 Not Supported |
+-----------------------------------------------------------------------------+
== cuda libs ===================================================
/usr/local/cuda-9.0/lib64/libcudart_static.a
/usr/local/cuda-9.0/lib64/libcudart.so.9.0.176
/usr/local/cuda-9.0/doc/man/man7/libcudart.7
/usr/local/cuda-9.0/doc/man/man7/libcudart.so.7
I ran into the same problem when trying to find the input/output nodes of a model I had trained with a custom tf.Estimator. The error occurs because the output you get from export_savedmodel is a servable (which, as I currently understand it, is a GraphDef plus other metadata), not just a bare GraphDef.
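To see this concretely, here is a small check of my own (not part of the original answer; the export path is a placeholder): saved_model.pb parses as a SavedModel proto but not as a plain GraphDef, which is exactly what the summarize_graph error is reporting.
# Minimal sketch (my addition): saved_model.pb is a SavedModel wrapper,
# not a bare GraphDef, so summarize_graph cannot parse it directly.
from tensorflow.core.framework import graph_pb2
from tensorflow.core.protobuf import saved_model_pb2

# Replace TIMESTAMP with the directory name export_savedmodel created.
with open("modeliris2pb/TIMESTAMP/saved_model.pb", "rb") as f:
    data = f.read()

sm = saved_model_pb2.SavedModel()
sm.ParseFromString(data)                    # parses fine as a SavedModel
print("meta graphs:", len(sm.meta_graphs))  # the GraphDef lives inside these

gd = graph_pb2.GraphDef()
try:
    gd.ParseFromString(data)                # what summarize_graph attempts
except Exception as e:
    print("not a plain GraphDef:", e)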
To find the input and output nodes, you can do the following:
# -*- coding: utf-8 -*-

import tensorflow as tf
from tensorflow.saved_model import tag_constants

with tf.Session(graph=tf.Graph()) as sess:
    gf = tf.saved_model.loader.load(
        sess,
        [tag_constants.SERVING],
        "/path/to/saved/model/")

    nodes = gf.graph_def.node
    print([n.name + " -> " + n.op for n in nodes
           if n.op in ('Softmax', 'Placeholder')])

# ... ['Placeholder -> Placeholder',
#      'dnn/head/predictions/probabilities -> Softmax']
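As a quick cross-check that the export itself serves correctly, here is a sketch of mine (not from the original answer; it assumes TF 1.x's tf.contrib.predictor and the "inputs" receiver key from the question's my_serving_input_fn):
# Hedged sketch: query the SavedModel directly, before any freezing.
import tensorflow as tf
from tensorflow.contrib import predictor

predict_fn = predictor.from_saved_model("/path/to/saved/model/")

# The serving input fn expects serialized tf.Example protos under "inputs".
example = tf.train.Example(features=tf.train.Features(feature={
    "thisisinput": tf.train.Feature(
        float_list=tf.train.FloatList(value=[5.1, 3.3, 1.7, 0.5]))
}))

print(predict_fn({"inputs": [example.SerializeToString()]}))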
I also used a canned DNNEstimator, so the nodes in the OP's graph should be the same as mine; for other readers, your op names may differ from Placeholder and Softmax depending on your classifier.
Now that you have the names of your input/output nodes, you can freeze the graph, which is addressed here:
If you want to work with the values of your trained parameters, for example to quantize weights, you'll need to run tensorflow/python/tools/freeze_graph.py to convert the checkpoint values into embedded constants within the graph file itself.
#!/bin/bash
python ./freeze_graph.py \
--in_graph="/path/to/model/saved_model.pb" \
--input_checkpoint="/MyModel/model.ckpt-xxxx" \
--output_graph="/home/user/pruned_saved_model_or_whatever.pb" \
--input_saved_model_dir="/path/to/model" \
--output_node_names="dnn/head/predictions/probabilities"
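Before summarizing, you can sanity-check the frozen graph by loading it back and feeding one serialized tf.Example (again a sketch of mine, not from the original answer; the tensor names assume the question's input_tensors placeholder and probabilities head, so substitute whatever summarize_graph reports for your graph):
# Hedged sketch: import the frozen GraphDef and run one example through it.
import tensorflow as tf

with tf.gfile.GFile("/home/user/pruned_saved_model_or_whatever.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")

example = tf.train.Example(features=tf.train.Features(feature={
    "thisisinput": tf.train.Feature(
        float_list=tf.train.FloatList(value=[6.4, 3.2, 4.5, 1.5]))
}))

with tf.Session(graph=graph) as sess:
    probs = sess.run(
        "dnn/head/predictions/probabilities:0",
        feed_dict={"input_tensors:0": [example.SerializeToString()]})
    print(probs)  # one row of 3 class probabilities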
Then, assuming you have already built graph_transforms:
#!/bin/bash
tensorflow/bazel-bin/tensorflow/tools/graph_transforms/summarize_graph \
--in_graph=pruned_saved_model_or_whatever.pb
Output:
Found 1 possible inputs: (name=Placeholder, type=string(7), shape=[?])
No variables spotted.
Found 1 possible outputs: (name=dnn/head/predictions/probabilities, op=Softmax)
Found 256974297 (256.97M) const parameters, 0 (0) variable parameters, and 0
control_edges
Op types used: 155 Const, 41 Identity, 32 RegexReplace, 18 Gather, 9
StridedSlice, 9 MatMul, 6 Shape, 6 Reshape, 6 Relu, 5 ConcatV2, 4 BiasAdd, 4
Add, 3 ExpandDims, 3 Pack, 2 NotEqual, 2 Where, 2 Select, 2 StringJoin, 2 Cast,
2 DynamicPartition, 2 Fill, 2 Maximum, 1 Size, 1 Unique, 1 Tanh, 1 Sum, 1
StringToHashBucketFast, 1 StringSplit, 1 Equal, 1 Squeeze, 1 Square, 1
SparseToDense, 1 SparseSegmentSqrtN, 1 SparseFillEmptyRows, 1 Softmax, 1
FloorDiv, 1 Rsqrt, 1 FloorMod, 1 HashTableV2, 1 LookupTableFindV2, 1 Range, 1
Prod, 1 Placeholder, 1 ParallelDynamicStitch, 1 LookupTableSizeV2, 1 Max, 1 Mul
To use with tensorflow/tools/benchmark:benchmark_model try these arguments:
bazel run tensorflow/tools/benchmark:benchmark_model -- --graph=pruned_saved_model.pb --show_flops --input_layer=Placeholder --input_layer_type=string --input_layer_shape=-1 --output_layer=dnn/head/predictions/probabilities
Hope this helps.
UPDATE (2018-12-03):
I opened a related GitHub issue, which seems to be resolved in a detailed blog post listed at the end of the ticket.