使用 GPU 的 tensorflow 教程中无法 运行 词嵌入示例
Fail to run word embedding example in tensorflow tutorial with GPUs
我正在尝试 运行 https://github.com/tensorflow/tensorflow/tree/master/tensorflow/g3doc/tutorials/word2vec 处的词嵌入示例代码(在 Ubuntu 14.04 下安装了 GPU 版本的 tensorflow),但它 returns 以下错误信息:
Found and verified text8.zip
Data size 17005207
Most common words (+UNK) [['UNK', 418391], ('the', 1061396), ('of', 593677), ('and', 416629), ('one', 411764)]
Sample data [5239, 3084, 12, 6, 195, 2, 3137, 46, 59, 156]
3084 -> 12
originated -> as
3084 -> 5239
originated -> anarchism
12 -> 3084
as -> originated
12 -> 6
as -> a
6 -> 12
a -> as
6 -> 195
a -> term
195 -> 6
term -> a
195 -> 2
term -> of
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 12
I tensorflow/core/common_runtime/gpu/gpu_init.cc:88] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:03:00.0
Total memory: 12.00GiB
Free memory: 443.32MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:88] Found device 1 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:05:00.0
Total memory: 12.00GiB
Free memory: 451.61MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:112] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_init.cc:122] 0: Y Y
I tensorflow/core/common_runtime/gpu/gpu_init.cc:122] 1: Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:643] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:643] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:47] Setting region size to 254881792
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:47] Setting region size to 263835648
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 12
Initialized
Traceback (most recent call last):
File "word2vec_basic.py", line 171, in <module>
_, loss_val = session.run([optimizer, loss], feed_dict=feed_dict)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 345, in run
results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 419, in _do_run
e.code)
tensorflow.python.framework.errors.InvalidArgumentError: Cannot assign a device to node 'GradientDescent/update_Variable_2/ScatterSub': Could not satisfy explicit device specification '' because the node was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/GPU:0'
[[Node: GradientDescent/update_Variable_2/ScatterSub = ScatterSub[T=DT_FLOAT, Tindices=DT_INT64, use_locking=false](Variable_2, gradients/concat_1, GradientDescent/update_Variable_2/mul)]]
Caused by op u'GradientDescent/update_Variable_2/ScatterSub', defined at:
File "word2vec_basic.py", line 145, in <module>
optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 167, in minimize
name=name)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 256, in apply_gradients
update_ops.append(self._apply_sparse(grad, var))
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/training/gradient_descent.py", line 40, in _apply_sparse
return var.scatter_sub(delta, use_locking=self._use_locking)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 324, in scatter_sub
use_locking=use_locking)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 227, in scatter_sub
name=name)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 633, in apply_op
op_def=op_def)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1710, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 988, in __init__
self._traceback = _extract_stack()
当我 运行 CPU 版本的 tensorflow 中的代码时,它工作得很好。但不适用于 GPU 版本。我还尝试使用 tf.device('/cpu:0') 强制使用 CUP 而不是 GPU,但它产生相同的输出。
这个例子中有什么函数不能在GPU中运行吗?由于 tf.device('/cpu:0') 不工作,我如何在不重新安装 CPU 版本的 tensorflow 的情况下切换到 CPU?
- scatter_sub 仅在当前版本的 cpu 上受支持。
- 我希望在 Ln119 上添加:使用 tf.device("/cpu:0") 应该强制所有内容都使用 cpus。你是如何使用tf.device的?
GPU 似乎不支持此示例中使用的一大堆操作。一个快速的解决方法是限制操作,使得只有矩阵乘积在 GPU 上是 运行。
文档中有一个示例:http://tensorflow.org/api_docs/python/framework.md
参见 tf.Graph.device(device_name_or_function)
部分
我能够通过以下方式让它工作:
def device_for_node(n):
if n.type == "MatMul":
return "/gpu:0"
else:
return "/cpu:0"
with graph.as_default():
with graph.device(device_for_node):
...
我正在尝试 运行 https://github.com/tensorflow/tensorflow/tree/master/tensorflow/g3doc/tutorials/word2vec 处的词嵌入示例代码(在 Ubuntu 14.04 下安装了 GPU 版本的 tensorflow),但它 returns 以下错误信息:
Found and verified text8.zip
Data size 17005207
Most common words (+UNK) [['UNK', 418391], ('the', 1061396), ('of', 593677), ('and', 416629), ('one', 411764)]
Sample data [5239, 3084, 12, 6, 195, 2, 3137, 46, 59, 156]
3084 -> 12
originated -> as
3084 -> 5239
originated -> anarchism
12 -> 3084
as -> originated
12 -> 6
as -> a
6 -> 12
a -> as
6 -> 195
a -> term
195 -> 6
term -> a
195 -> 2
term -> of
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 12
I tensorflow/core/common_runtime/gpu/gpu_init.cc:88] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:03:00.0
Total memory: 12.00GiB
Free memory: 443.32MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:88] Found device 1 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:05:00.0
Total memory: 12.00GiB
Free memory: 451.61MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:112] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_init.cc:122] 0: Y Y
I tensorflow/core/common_runtime/gpu/gpu_init.cc:122] 1: Y Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:643] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:643] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:47] Setting region size to 254881792
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:47] Setting region size to 263835648
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 12
Initialized
Traceback (most recent call last):
File "word2vec_basic.py", line 171, in <module>
_, loss_val = session.run([optimizer, loss], feed_dict=feed_dict)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 345, in run
results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 419, in _do_run
e.code)
tensorflow.python.framework.errors.InvalidArgumentError: Cannot assign a device to node 'GradientDescent/update_Variable_2/ScatterSub': Could not satisfy explicit device specification '' because the node was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/GPU:0'
[[Node: GradientDescent/update_Variable_2/ScatterSub = ScatterSub[T=DT_FLOAT, Tindices=DT_INT64, use_locking=false](Variable_2, gradients/concat_1, GradientDescent/update_Variable_2/mul)]]
Caused by op u'GradientDescent/update_Variable_2/ScatterSub', defined at:
File "word2vec_basic.py", line 145, in <module>
optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 167, in minimize
name=name)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 256, in apply_gradients
update_ops.append(self._apply_sparse(grad, var))
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/training/gradient_descent.py", line 40, in _apply_sparse
return var.scatter_sub(delta, use_locking=self._use_locking)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variables.py", line 324, in scatter_sub
use_locking=use_locking)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/gen_state_ops.py", line 227, in scatter_sub
name=name)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 633, in apply_op
op_def=op_def)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1710, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/chentingpc/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 988, in __init__
self._traceback = _extract_stack()
当我 运行 CPU 版本的 tensorflow 中的代码时,它工作得很好。但不适用于 GPU 版本。我还尝试使用 tf.device('/cpu:0') 强制使用 CUP 而不是 GPU,但它产生相同的输出。
这个例子中有什么函数不能在GPU中运行吗?由于 tf.device('/cpu:0') 不工作,我如何在不重新安装 CPU 版本的 tensorflow 的情况下切换到 CPU?
- scatter_sub 仅在当前版本的 cpu 上受支持。
- 我希望在 Ln119 上添加:使用 tf.device("/cpu:0") 应该强制所有内容都使用 cpus。你是如何使用tf.device的?
GPU 似乎不支持此示例中使用的一大堆操作。一个快速的解决方法是限制操作,使得只有矩阵乘积在 GPU 上是 运行。
文档中有一个示例:http://tensorflow.org/api_docs/python/framework.md
参见 tf.Graph.device(device_name_or_function)
部分我能够通过以下方式让它工作:
def device_for_node(n):
if n.type == "MatMul":
return "/gpu:0"
else:
return "/cpu:0"
with graph.as_default():
with graph.device(device_for_node):
...