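Using BI LSTM CTC Tensorflow Model in Android
TL;DR: I want to know how to use a bi-lstm-ctc TensorFlow model in an Android app.
I have successfully trained my bi-lstm-ctc TensorFlow model, and now I want to use it in my handwriting recognition Android app. Here is the part of the code that defines the graph I used: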
self.inputs = tf.placeholder(tf.float32, [None, None, network_config.num_features], name="input")
self.labels = tf.sparse_placeholder(tf.int32, name="label")
self.seq_len = tf.placeholder(tf.int32, [None], name="seq_len_input")
logits = self._bidirectional_lstm_layers(
    network_config.num_hidden_units,
    network_config.num_layers,
    network_config.num_classes
)
self.global_step = tf.Variable(0, trainable=False)
self.loss = tf.nn.ctc_loss(labels=self.labels, inputs=logits, sequence_length=self.seq_len)
self.cost = tf.reduce_mean(self.loss)
self.optimizer = tf.train.AdamOptimizer(network_config.learning_rate).minimize(self.cost)
self.decoded, self.log_prob = tf.nn.ctc_beam_search_decoder(inputs=logits, sequence_length=self.seq_len, merge_repeated=False)
self.dense_decoded = tf.sparse_tensor_to_dense(self.decoded[0], default_value=-1, name="output")
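After following the freeze and optimize graph code in this tutorial, I also managed to freeze and optimize the graph. Here is the part of the code that is supposed to run the model: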
bitmap = Bitmap.createScaledBitmap(bitmap, 1024, 128, true);
int[] intValues = new int[bitmap.getWidth() * bitmap.getHeight()];
bitmap.getPixels(intValues, 0, bitmap.getWidth(), 0, 0, bitmap.getWidth(), bitmap.getHeight());
float[] floatValues = new float[bitmap.getWidth() * bitmap.getHeight()];
for (int i = 0; i < intValues.length; ++i) {
    final int val = intValues[i];
    floatValues[i] = (((val >> 16) & 0xFF));
}
float[] result = new float[80];
long[] INPUT_SIZE = new long[]{1, bitmap.getHeight(), bitmap.getWidth()};
inferenceInterface.feed(config.getInputName(), floatValues, INPUT_SIZE);
inferenceInterface.feed("seq_len_input", new int[]{bitmap.getWidth()}, 1);
inferenceInterface.run(config.getOutputNames());
inferenceInterface.fetch(config.getOutputNames()[0], result);
return result.toString();
However, depending on which model I use, I run into different problems. If I use the frozen graph, I get this error:
Caused by: java.lang.IllegalArgumentException: No OpKernel was registered to support
Op 'SparseToDense' with these attrs. Registered devices: [CPU], Registered kernels:
device='CPU'; T in [DT_STRING]; Tindices in [DT_INT64]
device='CPU'; T in [DT_STRING]; Tindices in [DT_INT32]
device='CPU'; T in [DT_BOOL]; Tindices in [DT_INT64]
device='CPU'; T in [DT_BOOL]; Tindices in [DT_INT32]
device='CPU'; T in [DT_FLOAT]; Tindices in [DT_INT64]
device='CPU'; T in [DT_FLOAT]; Tindices in [DT_INT32]
device='CPU'; T in [DT_INT32]; Tindices in [DT_INT64]
device='CPU'; T in [DT_INT32]; Tindices in [DT_INT32]
[[Node: output = SparseToDense[T=DT_INT64, Tindices=DT_INT64, validate_indices=true](CTCBeamSearchDecoder, CTCBeamSearchDecoder:2, CTCBeamSearchDecoder:1, output/default_value)]]
If I use the optimized frozen graph, I get this error:
java.io.IOException: Not a valid TensorFlow Graph serialization: NodeDef expected inputs '' do not match 1 inputs
specified; Op<name=Const; signature= -> output:dtype; attr=value:tensor; attr=dtype:type>;
NodeDef: stack_bidirectional_rnn/cell_0/bidirectional_rnn/bw/bw/while/add/y = Const[dtype=DT_INT32,
value=Tensor<type: int32 shape: [] values: 1>](stack_bidirectional_rnn/cell_0/bidirectional_rnn/bw/bw/while/Switch:1)
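Aside from ways to solve these errors, I have other questions/clarifications:
How do I solve these errors?
I got it working. The solution can also be found in this GitHub issue.
Apparently, the problem was with the types used. I was passing int64 where only int32 is accepted.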
self.dense_decoded = tf.sparse_tensor_to_dense(self.decoded[0], default_value=-1, name="output")
To fix this, I cast the sparse tensor's elements to int32:
self.dense_decoded = tf.sparse_to_dense(tf.to_int32(self.decoded[0].indices),
                                        tf.to_int32(self.decoded[0].dense_shape),
                                        tf.to_int32(self.decoded[0].values),
                                        name="output")
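One detail worth checking in the call above: tf.sparse_to_dense fills unset positions with default_value, which defaults to 0, whereas the original tf.sparse_tensor_to_dense call used default_value=-1. The plain-Java sketch below only illustrates what a sparse-to-dense conversion computes for a 2-D output and how the padding value shows up; the class and method names are hypothetical and it is not part of TensorFlow.

```java
import java.util.Arrays;

// Illustrative only: mimics a 2-D sparse-to-dense conversion in plain Java.
public class SparseToDenseSketch {

    // indices[i] = {row, col} of values[i]; every other cell gets defaultValue.
    static int[][] sparseToDense(int[][] indices, int[] shape, int[] values, int defaultValue) {
        int[][] dense = new int[shape[0]][shape[1]];
        for (int[] row : dense) {
            Arrays.fill(row, defaultValue);
        }
        for (int i = 0; i < values.length; i++) {
            dense[indices[i][0]][indices[i][1]] = values[i];
        }
        return dense;
    }

    public static void main(String[] args) {
        int[][] indices = {{0, 0}, {0, 1}, {1, 0}};
        int[] shape = {2, 3};
        int[] values = {5, 6, 7};
        // Padding with -1 keeps padded cells distinguishable from class index 0.
        System.out.println(Arrays.deepToString(sparseToDense(indices, shape, values, -1)));
        // Padding with 0 makes padded cells look like class index 0.
        System.out.println(Arrays.deepToString(sparseToDense(indices, shape, values, 0)));
    }
}
```

If class index 0 is a real character in the model's label set, passing default_value=-1 explicitly to tf.sparse_to_dense would keep the padding distinguishable.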
Running the app after that gave me this error:
java.lang.IllegalArgumentException: Matrix size-incompatible: In[0]: [1,1056], In[1]: [160,128]
[[Node:stack_bidirectional_rnn/cell_0/bidirectional_rnn/bw/bw/while/bw/basic_lstm_cell/basic_lstm_cell/
MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/cpu:0"]
(stack_bidirectional_rnn/cell_0/bidirectional_rnn/bw/bw/while/bw/basic_lstm_cell/basic_lstm_cell/concat,
stack_bidirectional_rnn/cell_0/bidirectional_rnn/bw/bw/while/bw/basic_lstm_cell/basic_lstm_cell/MatMul/Enter)]]
For some strange reason, changing the image width from 1024 to 128 in the Java code fixed that error. Running the app again then gave me this error:
java.lang.IllegalArgumentException: cannot use java.nio.FloatArrayBuffer with Tensor of type INT32
The problem occurs when fetching the output. From this I could tell that the model itself ran, but the app could not fetch the result.
inferenceInterface.run(outputs);
inferenceInterface.fetch(outputs[0], result); //where the error happens
Silly me, I forgot that the output is an integer array, not a float array. So I changed the type of the result array to an int array:
//float[] result = new float[80];
int[] result = new int[80];
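The int array holds class indices, so it still has to be mapped to characters before it can be shown (calling toString() on a Java array does not print its contents). Below is a minimal sketch of that mapping; the alphabet string, the padding value, and the class name are assumptions for illustration and must match whatever label set and default_value the model was actually trained and exported with.

```java
// Illustrative only: turns dense CTC decoder output into text.
public class CtcLabelDecoder {

    // Maps each class index to a character of `alphabet`.
    // Stops at the first padding value and skips indices outside the alphabet.
    static String decode(int[] labels, String alphabet, int padValue) {
        StringBuilder text = new StringBuilder();
        for (int label : labels) {
            if (label == padValue) {
                break; // the rest of the row is padding
            }
            if (label >= 0 && label < alphabet.length()) {
                text.append(alphabet.charAt(label));
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Hypothetical label set: index 0 is 'a', index 25 is 'z'.
        String alphabet = "abcdefghijklmnopqrstuvwxyz";
        int[] result = {7, 4, 11, 11, 14, -1, -1, -1};
        System.out.println(decode(result, alphabet, -1));
    }
}
```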
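That got the app running. The model's accuracy is not good because it has not been trained properly; I just wanted to get it working in the app. Time for some serious training!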
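A possible explanation for why the width change fixed the matrix-size error, assuming the cell is a BasicLSTMCell as the node name suggests: its kernel has shape [input_depth + num_units, 4 * num_units]. In[1] = [160,128] then implies 32 hidden units and an input depth of 128, while In[0] = [1,1056] implies the app fed 1024 features per time step, because the feed shape {1, height, width} puts the image width on the feature axis. The sketch below redoes that arithmetic and shows one way to reorder the pixels so that each time step is one image column. The class and method names are hypothetical, and whether the model was trained on image columns is an assumption to check against the training pipeline.

```java
// Illustrative only: shape arithmetic for the MatMul error and a pixel reorder helper.
public class LstmShapeCheck {

    // Kernel columns are 4 * numUnits (one block per LSTM gate).
    static int numUnits(int kernelCols) {
        return kernelCols / 4;
    }

    // Kernel rows are inputDepth + numUnits, because the cell multiplies concat(input, h).
    static int inputDepth(int kernelRows, int kernelCols) {
        return kernelRows - numUnits(kernelCols);
    }

    // Reorders row-major [height][width] pixels into [width][height],
    // so that a feed shape of {1, width, height} treats each column as one time step.
    static float[] toColumnSteps(float[] rowMajor, int height, int width) {
        float[] out = new float[rowMajor.length];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                out[x * height + y] = rowMajor[y * width + x];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int units = numUnits(128);           // from In[1] = [160,128]
        int expected = inputDepth(160, 128); // features the trained kernel expects
        int fed = 1056 - units;              // features implied by In[0] = [1,1056]
        System.out.println("units=" + units + " expected=" + expected + " fed=" + fed);
    }
}
```

With a 1024x128 bitmap reordered this way, the input would be fed as {1, 1024, 128} and seq_len_input as 1024, rather than squashing the image to 128x128.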