Converting Tensorflow Graph to use Estimator, get 'TypeError: data type not understood' at loss function using `sampled_softmax_loss` or `nce_loss`
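I am trying to convert Tensorflow's official basic word2vec implementation to use tf.Estimator.
The problem is that the loss function (`sampled_softmax_loss` or `nce_loss`) raises an error when used with a Tensorflow Estimator. It works fine in the original implementation.
Here is Tensorflow's official basic word2vec implementation:
Here is the Google Colab notebook where I implemented this code; it works.
https://colab.research.google.com/drive/1nTX77dRBHmXx6PEF5pmYpkIVxj_TqT5I
Here is the Google Colab notebook where I changed the code to use a Tensorflow Estimator; this version does not work.
https://colab.research.google.com/drive/1IVDqGwMx6BK5-Bgrw190jqHU6tt3ZR3e
For convenience, here is the exact code from the Estimator version above, where I define the `model_fn`: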
batch_size = 128
embedding_size = 128  # Dimension of the embedding vector.
skip_window = 1  # How many words to consider left and right.
num_skips = 2  # How many times to reuse an input to generate a label.
num_sampled = 64  # Number of negative examples to sample.

def my_model(features, labels, mode, params):
    with tf.name_scope('inputs'):
        train_inputs = features
        train_labels = labels

    with tf.name_scope('embeddings'):
        embeddings = tf.Variable(
            tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
        embed = tf.nn.embedding_lookup(embeddings, train_inputs)

    with tf.name_scope('weights'):
        nce_weights = tf.Variable(
            tf.truncated_normal(
                [vocabulary_size, embedding_size],
                stddev=1.0 / math.sqrt(embedding_size)))
    with tf.name_scope('biases'):
        nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

    with tf.name_scope('loss'):
        loss = tf.reduce_mean(
            tf.nn.nce_loss(
                weights=nce_weights,
                biases=nce_biases,
                labels=train_labels,
                inputs=embed,
                num_sampled=num_sampled,
                num_classes=vocabulary_size))

    tf.summary.scalar('loss', loss)

    if mode == "train":
        with tf.name_scope('optimizer'):
            optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=optimizer)
This is where I create the Estimator and call train:
word2vecEstimator = tf.estimator.Estimator(
    model_fn=my_model,
    params={
        'batch_size': 16,
        'embedding_size': 10,
        'num_inputs': 3,
        'num_sampled': 128,
    })

word2vecEstimator.train(
    input_fn=generate_batch,
    steps=10)
This is the error message I get when I call the Estimator's train:
INFO:tensorflow:Calling model_fn.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-22-955f44867ee5> in <module>()
1 word2vecEstimator.train(
2 input_fn=generate_batch,
----> 3 steps=10)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py in train(self, input_fn, hooks, steps, max_steps, saving_listeners)
352
353 saving_listeners = _check_listeners_type(saving_listeners)
--> 354 loss = self._train_model(input_fn, hooks, saving_listeners)
355 logging.info('Loss for final step: %s.', loss)
356 return self
/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py in _train_model(self, input_fn, hooks, saving_listeners)
1205 return self._train_model_distributed(input_fn, hooks, saving_listeners)
1206 else:
-> 1207 return self._train_model_default(input_fn, hooks, saving_listeners)
1208
1209 def _train_model_default(self, input_fn, hooks, saving_listeners):
/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py in _train_model_default(self, input_fn, hooks, saving_listeners)
1235 worker_hooks.extend(input_hooks)
1236 estimator_spec = self._call_model_fn(
-> 1237 features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
1238 global_step_tensor = training_util.get_global_step(g)
1239 return self._train_with_estimator_spec(estimator_spec, worker_hooks,
/usr/local/lib/python3.6/dist-packages/tensorflow/python/estimator/estimator.py in _call_model_fn(self, features, labels, mode, config)
1193
1194 logging.info('Calling model_fn.')
-> 1195 model_fn_results = self._model_fn(features=features, **kwargs)
1196 logging.info('Done calling model_fn.')
1197
<ipython-input-20-9d389437162a> in my_model(features, labels, mode, params)
33 inputs=embed,
34 num_sampled=num_sampled,
---> 35 num_classes=vocabulary_size))
36
37 # Add the loss value as a scalar to summary.
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_impl.py in nce_loss(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, remove_accidental_hits, partition_strategy, name)
1246 remove_accidental_hits=remove_accidental_hits,
1247 partition_strategy=partition_strategy,
-> 1248 name=name)
1249 sampled_losses = sigmoid_cross_entropy_with_logits(
1250 labels=labels, logits=logits, name="sampled_losses")
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_impl.py in _compute_sampled_logits(weights, biases, labels, inputs, num_sampled, num_classes, num_true, sampled_values, subtract_log_q, remove_accidental_hits, partition_strategy, name, seed)
1029 with ops.name_scope(name, "compute_sampled_logits",
1030 weights + [biases, inputs, labels]):
-> 1031 if labels.dtype != dtypes.int64:
1032 labels = math_ops.cast(labels, dtypes.int64)
1033 labels_flat = array_ops.reshape(labels, [-1])
TypeError: data type not understood
Edit: As requested, here is what typical output of the input_fn looks like:
print(generate_batch(batch_size=8, num_skips=2, skip_window=1))
(array([3081, 3081, 12, 12, 6, 6, 195, 195], dtype=int32), array([[5234],
[ 12],
[ 6],
[3081],
[ 12],
[ 195],
[ 6],
[ 2]], dtype=int32))
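You are passing `generate_batch` as a variable here:
word2vecEstimator.train(
    input_fn=generate_batch,
    steps=10)
Call the function with `generate_batch()`.
But I think you have to pass some values to the function.
Possibly the tensors and ops have to be inside the `input_fn`, not in the `model_fn`.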
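One caveat on the suggestion above: as far as I understand the Estimator API, `train` expects `input_fn` to be a callable that it invokes itself, so passing the result of `generate_batch()` would hand it a tuple rather than a function. If the function needs values, one option is to bind them ahead of time. The sketch below uses a hypothetical stand-in for `generate_batch` (the real one lives in the notebook) and only illustrates the binding, not the training:

```python
import functools


def generate_batch(batch_size, num_skips, skip_window):
    """Hypothetical stand-in for the notebook's generate_batch."""
    features = list(range(batch_size))
    labels = [[i + 1] for i in range(batch_size)]
    return features, labels


# Bind the arguments now; the result is still a callable that takes no arguments,
# which is the shape of thing an input_fn parameter expects.
input_fn = functools.partial(
    generate_batch, batch_size=8, num_skips=2, skip_window=1)

features, labels = input_fn()
print(callable(input_fn), len(features), len(labels))
```

A `lambda: generate_batch(8, 2, 1)` would do the same job; `functools.partial` just keeps the bound arguments inspectable.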
I found this issue #4026 which solved my problem ... Maybe it is just me being stupid, but it would be great if you mention that the tensors and ops all have to be inside the input_fn somewhere in the documentation.
You have to call read_batch_examples from somewhere inside input_fn so that the tensors it creates are in the graph that Estimator creates in fit().
https://github.com/tensorflow/tensorflow/issues/8042
Oh I feel like an idiot! I've been creating the op outside of the graph scope. It works now, can't believe I didn't think to try that. Thanks a lot! This is a non-issue and has been resolved
https://github.com/tensorflow/tensorflow/issues/4026
However, this is still not enough information to say what is causing the problem. It is only a clue.
Found the answer:
The error clearly says that you have an invalid type for labels.
You are trying to pass a numpy array instead of a Tensor. Sometimes Tensorflow performs an implicit conversion from ndarray to Tensor under the hood (that's why your code works outside of Estimator), but in this case it doesn't.

No, the official implementation feeds data from a placeholder. A placeholder is always a Tensor, so it doesn't depend on implicit conversion.
But if you call the loss function directly with a numpy array as input (notice: the call happens during the graph construction phase, so the argument's content gets embedded into the graph), it MAY work (however, I did not check it).
This code:
nce_loss(labels=[1,2,3])
will be called only ONCE, during graph construction. The labels will be statically embedded into the graph as a constant and can potentially be of any Tensor-compatible type (list, ndarray, etc.).
This code:
```Python
def model(label_input):
    nce_loss(labels=label_input)

estimator(model_fn=model).train()
```
can't embed the labels variable statically, because its content is not defined during graph construction. So if you feed anything except a Tensor, it will throw an error.
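To make the type mismatch concrete, here is a small NumPy-only sketch using the labels printed in the question. It assumes, as the explanation above suggests, that the value returned by `input_fn` reaches `model_fn` unchanged; the variable names are mine, not from the notebook.

```python
import numpy as np

# The labels printed by generate_batch in the question: a plain ndarray, not a Tensor.
labels = np.array(
    [[5234], [12], [6], [3081], [12], [195], [6], [2]], dtype=np.int32)

# ndarray.dtype is a NumPy dtype object. This is what the
# `labels.dtype != dtypes.int64` line in the traceback ends up
# comparing against a TensorFlow dtype.
print(type(labels).__name__, labels.dtype, labels.shape)  # ndarray int32 (8, 1)

# nce_loss documents labels as int64 with shape [batch_size, num_true],
# so the shape is already right and only the dtype and container differ.
labels_int64 = labels.astype(np.int64)
print(labels_int64.dtype, labels_int64.shape)  # int64 (8, 1)
```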
So I used `labels=tf.dtypes.cast(train_labels, tf.int64)` and it worked.
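A note of caution on this fix: as far as I can tell, the cast works because it converts the ndarray into a constant Tensor at graph construction time, which would mean the one batch returned by `generate_batch` is baked into the graph and reused at every step. An alternative is to have the `input_fn` return a `tf.data.Dataset`, so that the `model_fn` receives real Tensors that change per step. The sketch below is only that, a sketch: `make_input_fn` is a hypothetical helper, and it assumes the training pairs are already available as two NumPy arrays.

```python
import numpy as np
import tensorflow as tf


def make_input_fn(inputs, labels, batch_size):
    """Hypothetical helper: wraps precomputed arrays in a Dataset-based input_fn."""
    def input_fn():
        # Cast labels to int64 up front, the dtype nce_loss documents for labels.
        dataset = tf.data.Dataset.from_tensor_slices(
            (inputs, labels.astype(np.int64)))
        # repeat() lets training run for as many steps as requested.
        return dataset.repeat().batch(batch_size)
    return input_fn


# Toy data taken from the batch printed in the question.
inputs = np.array([3081, 3081, 12, 12, 6, 6, 195, 195], dtype=np.int32)
labels = np.array(
    [[5234], [12], [6], [3081], [12], [195], [6], [2]], dtype=np.int32)

input_fn = make_input_fn(inputs, labels, batch_size=4)
```

With this, the training call would become `word2vecEstimator.train(input_fn=input_fn, steps=10)`, and the cast inside `my_model` should no longer be needed, although keeping it is harmless.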