TensorFlow 2 GloVe: "could not broadcast input array" when preparing the embedding matrix (not the +1 problem)

I get ValueError: could not broadcast input array from shape (50) into shape (100) while preparing the embedding matrix. I have already loaded GloVe and built the word-to-vector dictionary; it found 400000 word vectors.

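I have looked at a number of similar questions, but they all seem to come down to forgetting to add +1 to the maximum number of words. I think I have covered that, yet I still get the error. Any help is much appreciated.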

num_words = min(MAX_NUM_WORDS, len(word2idx_inputs) + 1)

I also tried:

num_words = min(MAX_NUM_WORDS, len(word2idx_inputs)) + 1

I also went through this question:

Keras word embeddings Glove: can't prepare the embedding matrix

but that one turned out to be the +1 problem as well.

FYI: I am an extreme newbie. This is my first seq2seq model, and I am building it to translate Tagalog to English.

Error received


Filling pre-trained embeddings...

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-acf0d8a4c4ca> in <module>
     8     if embedding_vector is not None:
     9       # words not found in embedding index will be all zeros.
---> 10       embedding_matrix[i] = embedding_vector
    11 
    12 # create embedding layer

ValueError: could not broadcast input array from shape (50) into shape (100)

Code


# prepare embedding matrix
print('Filling pre-trained embeddings...')
num_words = min(MAX_NUM_WORDS, len(word2idx_inputs) + 1)
embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
for word, i in word2idx_inputs.items():
    if i < MAX_NUM_WORDS:
        embedding_vector = word2vec.get(word)
        if embedding_vector is not None:
            # words not found in embedding index will be all zeros.
            embedding_matrix[i] = embedding_vector

# create embedding layer
embedding_layer = Embedding(
    num_words,
    EMBEDDING_DIM,
    weights=[embedding_matrix],
    input_length=max_len_input,
    # trainable=True
)

# create targets, since we cannot use sparse
# categorical cross entropy when we have sequences
decoder_targets_one_hot = np.zeros(
    (
        len(input_texts),
        max_len_target,
        num_words_output
    ),
    dtype='float32'
)

# assign the values
for i, d in enumerate(decoder_targets):
    for t, word in enumerate(d):
        if word != 0:
            decoder_targets_one_hot[i, t, word] = 1


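Check the value of EMBEDDING_DIM. The pre-trained vectors you loaded have fewer dimensions than the matrix expects: the error says a vector of shape (50) cannot be broadcast into a row of shape (100). So set EMBEDDING_DIM = 50, or load a GloVe file whose vectors are 100-dimensional.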
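
One way to avoid this kind of mismatch is to read the dimension from the loaded vectors instead of hard-coding it. The following is only a minimal sketch: the tiny word2vec and word2idx_inputs dictionaries and the MAX_NUM_WORDS value are made-up stand-ins for the question's data, not the real GloVe file or tokenizer output.

```python
import numpy as np

# Hypothetical stand-ins for the question's data: a tiny "GloVe" dictionary
# with 50-dimensional vectors and a small word index (index 0 is reserved).
rng = np.random.default_rng(0)
word2vec = {
    w: rng.standard_normal(50).astype("float32") for w in ("the", "cat", "sat")
}
word2idx_inputs = {"the": 1, "cat": 2, "dog": 3}
MAX_NUM_WORDS = 20000  # assumed value, not taken from the question

# Take the dimension from the loaded vectors rather than hard-coding it,
# so the matrix rows always match the pre-trained vectors.
EMBEDDING_DIM = len(next(iter(word2vec.values())))

num_words = min(MAX_NUM_WORDS, len(word2idx_inputs) + 1)
embedding_matrix = np.zeros((num_words, EMBEDDING_DIM), dtype="float32")
for word, i in word2idx_inputs.items():
    if i < MAX_NUM_WORDS:
        embedding_vector = word2vec.get(word)
        if embedding_vector is not None:
            # Words missing from the pre-trained index keep an all-zero row.
            embedding_matrix[i] = embedding_vector

print(embedding_matrix.shape)
```

With the real data, the same EMBEDDING_DIM value would then be passed to the Embedding layer, so the layer and the matrix stay in agreement whichever GloVe file is loaded.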