Not able to use Embedding Layer with tf.distribute.MirroredStrategy
I am trying to parallelize a model that contains an Embedding layer on TensorFlow version 2.4.1, but it throws the following error:
InvalidArgumentError: Cannot assign a device for operation sequential/emb_layer/embedding_lookup/ReadVariableOp: Could not satisfy explicit device specification '' because the node {{colocation_node sequential/emb_layer/embedding_lookup/ReadVariableOp}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0, /job:localhost/replica:0/task:0/device:GPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=2 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
GatherV2: GPU CPU XLA_CPU XLA_GPU
Cast: GPU CPU XLA_CPU XLA_GPU
Const: GPU CPU XLA_CPU XLA_GPU
ResourceSparseApplyAdagradV2: CPU
_Arg: GPU CPU XLA_CPU XLA_GPU
ReadVariableOp: GPU CPU XLA_CPU XLA_GPU
Colocation members, user-requested devices, and framework assigned devices, if any:
sequential_emb_layer_embedding_lookup_readvariableop_resource (_Arg) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
adagrad_adagrad_update_update_0_resourcesparseapplyadagradv2_accum (_Arg) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
sequential/emb_layer/embedding_lookup/ReadVariableOp (ReadVariableOp)
sequential/emb_layer/embedding_lookup/axis (Const)
sequential/emb_layer/embedding_lookup (GatherV2)
gradient_tape/sequential/emb_layer/embedding_lookup/Shape (Const)
gradient_tape/sequential/emb_layer/embedding_lookup/Cast (Cast)
Adagrad/Adagrad/update/update_0/ResourceSparseApplyAdagradV2 (ResourceSparseApplyAdagradV2) /job:localhost/replica:0/task:0/device:GPU:0
[[{{node sequential/emb_layer/embedding_lookup/ReadVariableOp}}]] [Op:__inference_train_function_631]
I have simplified the model down to a minimal example to make it reproducible:
import tensorflow as tf

central_storage_strategy = tf.distribute.MirroredStrategy()
with central_storage_strategy.scope():
    user_model = tf.keras.Sequential([
        tf.keras.layers.Embedding(10, 2, name="emb_layer")
    ])
    user_model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1), loss="mse")
user_model.fit([1], [[1, 2]], epochs=3)
Any help would be greatly appreciated. Thanks!
So I finally figured this out, in case anyone else is looking for the answer.
TensorFlow currently does not have a complete GPU implementation of the Adagrad optimizer. The ResourceSparseApplyAdagradV2 op, which performs the sparse gradient update for the Embedding layer, only has a CPU kernel (as the colocation debug info above shows) and therefore errors out on GPU. So Adagrad cannot be used with an Embedding layer under a data-parallel strategy. Switching to Adam or RMSprop works fine.
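For reference, here is a minimal sketch (not from the original post) of the same toy reproduction with the optimizer swapped to Adam; with this change the training step no longer requires the CPU-only ResourceSparseApplyAdagradV2 op and runs under MirroredStrategy without the device-placement error:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # Same toy embedding model as above, but compiled with Adam,
    # whose sparse update has a GPU kernel.
    user_model = tf.keras.Sequential([
        tf.keras.layers.Embedding(10, 2, name="emb_layer")
    ])
    user_model.compile(optimizer=tf.keras.optimizers.Adam(0.1), loss="mse")
user_model.fit([1], [[1, 2]], epochs=3)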