Load pretrained word embedding into Tensorflow model
I am trying to modify this Tensorflow LSTM model to load this pre-trained GoogleNews word embedding, GoogleNews-vectors-negative300.bin (or a Tensorflow Word2Vec embedding would be just as good).
I have been reading examples of how to load a pre-trained word embedding into Tensorflow (e.g. 1: here, 2: here, and 4: here).
In the first linked example they can easily assign the embedding to the graph:
sess.run(cnn.W.assign(initW))
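For context, a minimal sketch of that pattern; only cnn.W.assign(initW) comes from the linked example, the variable name W and the shapes here are assumed:

import numpy as np
import tensorflow as tf

vocab_size, embedding_dim = 10000, 300   # assumed sizes

# The model defines an embedding variable (cnn.W in the linked example)
W = tf.Variable(tf.random_uniform([vocab_size, embedding_dim], -1.0, 1.0), name="W")

# initW holds the pre-trained vectors as a numpy array of the same shape
initW = np.random.uniform(-0.25, 0.25, (vocab_size, embedding_dim)).astype(np.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(W.assign(initW))   # overwrites the random init with the pre-trained matrix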
In the second linked example they create an embedding-wrapper variable:
with tf.variable_scope("embedding_rnn_seq2seq/rnn/embedding_wrapper", reuse=True):
    em_in = tf.get_variable("embedding")
Then they initialize the embedding wrapper:
sess.run(em_in.assign(initW))
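A sketch of how that reuse pattern fits together, assuming the variable was created earlier under the same scope (the shape here is illustrative):

import numpy as np
import tensorflow as tf

# The model creates the variable under this scope first (shape assumed):
with tf.variable_scope("embedding_rnn_seq2seq/rnn/embedding_wrapper"):
    tf.get_variable("embedding", shape=[10000, 300])

# reuse=True fetches the existing variable by name instead of creating a new one:
with tf.variable_scope("embedding_rnn_seq2seq/rnn/embedding_wrapper", reuse=True):
    em_in = tf.get_variable("embedding")

initW = np.random.uniform(-0.25, 0.25, (10000, 300)).astype(np.float32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(em_in.assign(initW))   # overwrite with the pre-trained vectors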
Both examples make sense, but it is not obvious to me how to assign the unpacked embedding initW to the TF graph in my case. (I am new to TF.)
I can prepare initW as in the first two examples:
import os
import numpy as np

def loadEmbedding(self, word_to_id):
    # New model, we load the pre-trained word2vec data and initialize embeddings
    with open(os.path.join('GoogleNews-vectors-negative300.bin'), "rb", 0) as f:
        header = f.readline()
        vocab_size, vector_size = map(int, header.split())
        binary_len = np.dtype('float32').itemsize * vector_size
        # Rows default to small random values; known words are overwritten below
        initW = np.random.uniform(-0.25, 0.25, (len(word_to_id), vector_size))
        for line in range(vocab_size):
            # Read one token byte by byte, up to the separating space
            word = []
            while True:
                ch = f.read(1)
                if ch == b' ':
                    word = b''.join(word).decode('utf-8')
                    break
                if ch != b'\n':
                    word.append(ch)
            if word in word_to_id:
                # np.fromstring is deprecated in newer NumPy; np.frombuffer is the replacement
                initW[word_to_id[word]] = np.fromstring(f.read(binary_len), dtype='float32')
            else:
                f.read(binary_len)  # skip the vector of an out-of-vocabulary word
    return initW
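As an aside, the same matrix can be built by letting gensim parse the binary file instead of reading it by hand; a sketch, assuming gensim is installed (the helper name loadEmbeddingGensim is hypothetical, not part of my code):

import numpy as np
from gensim.models import KeyedVectors

def loadEmbeddingGensim(word_to_id, vector_size=300):
    # Hypothetical alternative to the manual parser above
    w2v = KeyedVectors.load_word2vec_format(
        'GoogleNews-vectors-negative300.bin', binary=True)
    initW = np.random.uniform(-0.25, 0.25, (len(word_to_id), vector_size))
    for word, idx in word_to_id.items():
        if word in w2v:        # keep the random row for out-of-vocabulary words
            initW[idx] = w2v[word]
    return initW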
Based on the solution in example 4, I thought I should be able to do something like
session.run(tf.assign(embedding, initW))
If I try to add the line like this, when the session is initialized:
with sv.managed_session() as session:
    initializer = tf.random_uniform_initializer(-config.init_scale,
                                                config.init_scale)
    session.run(tf.assign(m.embedding, initW))
I get the following error:
ValueError: Fetch argument <tf.Tensor 'Assign:0' shape=(10000, 300) dtype=float32_ref> cannot be interpreted as a Tensor. (Tensor Tensor("Assign:0", shape=(10000, 300), dtype=float32_ref, device=/device:CPU:0) is not an element of this graph.)
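The same class of error can be reproduced in isolation: sess.run can only fetch ops that belong to the session's own graph. A minimal sketch (illustrative names, not from my model):

import numpy as np
import tensorflow as tf

# The variable and the assign op end up in the global default graph ...
embedding = tf.get_variable("embedding", shape=[10000, 300])
init_op = tf.assign(embedding, np.zeros((10000, 300), dtype=np.float32))

# ... but the session is bound to a different, freshly created graph:
with tf.Session(graph=tf.Graph()) as sess:
    sess.run(init_op)   # ValueError: ... is not an element of this graph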
Update: I updated the code following Nilesh Birari's suggestion: Full code. It only improves perplexity on the training set, not on the validation or test sets.
An attempt to answer with my limited understanding of Tensorflow; please correct me if I am wrong.
ValueError: Fetch argument <tf.Tensor 'Assign:0' shape=(10000, 300) dtype=float32_ref> cannot be interpreted as a Tensor. (Tensor Tensor("Assign:0", shape=(10000, 300), dtype=float32_ref, device=/device:CPU:0) is not an element of this graph.)
This just says that you are trying to initialize an element of a different graph, so I guess you need to be within the same scope in which the graph is defined. Simply moving your embedding-initialization code into that same scope solves the problem:
with tf.Graph().as_default():
    initializer = tf.random_uniform_initializer(-config.init_scale,
                                                config.init_scale)

    with tf.name_scope("Train"):
        train_input = PTBInput(config=config, data=train_data, name="TrainInput")
        with tf.variable_scope("Model", reuse=None, initializer=initializer):
            m = PTBModel(is_training=True, config=config, input_=train_input)
        tf.summary.scalar("Training Loss", m.cost)
        tf.summary.scalar("Learning Rate", m.lr)

    with tf.name_scope("Valid"):
        valid_input = PTBInput(config=config, data=valid_data, name="ValidInput")
        with tf.variable_scope("Model", reuse=True, initializer=initializer):
            mvalid = PTBModel(is_training=False, config=config, input_=valid_input)
        tf.summary.scalar("Validation Loss", mvalid.cost)

    with tf.name_scope("Test"):
        test_input = PTBInput(config=eval_config, data=test_data, name="TestInput")
        with tf.variable_scope("Model", reuse=True, initializer=initializer):
            mtest = PTBModel(is_training=False, config=eval_config,
                             input_=test_input)

    sv = tf.train.Supervisor(logdir=FLAGS.save_path)
    with sv.managed_session() as session:
        word2vec = loadEmbedding(word_to_id)
        session.run(tf.assign(m.embedding, word2vec))
        print("WORKED!!!")
I guess this should be the only problem; as you can see in the first example, the initialization happens within the same scope.
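One caveat worth noting: tf.assign(m.embedding, word2vec) converts the whole numpy matrix into a constant stored inside the graph. For very large embeddings the common TF1 pattern is to feed the matrix through a placeholder instead, so the graph stays small; a sketch (the names embedding_ph and embedding_init are assumed, not from the code above):

# Build these inside the same `with tf.Graph().as_default():` block as the model
embedding_ph = tf.placeholder(tf.float32, shape=m.embedding.get_shape())
embedding_init = m.embedding.assign(embedding_ph)

with sv.managed_session() as session:
    session.run(embedding_init, feed_dict={embedding_ph: word2vec})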