Pre-trained embedding layer: tf.constant with unsupported shape

I want to use pre-trained word embeddings in a Keras model. My weight matrix is stored in matrix.w2v.wv.vectors.npy and has shape (150854, 100).

When I add an embedding layer with different dimensions to the Keras model, like this:

model.add(Embedding(5000, 100,
    embeddings_initializer=keras.initializers.Constant(emb_matrix),
    input_length=875, trainable=False))

I get the following error:

---------------------------------------------------------------------------
TypeError                         Traceback (most recent call last)
<ipython-input-61-8731e904e60a> in <module>()
  1 model = Sequential()
  2 
----> 3 model.add(Embedding(5000,100,
   embeddings_initializer=keras.initializers.Constant(emb_matrix),
   input_length=875,trainable=False))
  4 model.add(Conv1D(128, 10, padding='same', activation='relu'))
  5 model.add(MaxPooling1D(10))

22 frames

/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py in _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
    323   raise TypeError("Eager execution of tf.constant with unsupported shape "
    324                   "(value has %d elements, shape is %s with %d elements)." %
--> 325                   (num_t, shape, shape.num_elements()))
    326 
    327 

TypeError: Eager execution of tf.constant with unsupported shape (value has 15085400 elements, shape is (5000, 100) with 500000 elements).

Please tell me what I am doing wrong.

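Your embedding layer expects a vocabulary of 5,000 words and initializes an embedding matrix of shape 5000×100. However, the word2vec model you are trying to load has a vocabulary of 150,854 words.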
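The two element counts in the error message follow directly from these shapes. A quick check, using only the shapes quoted in the question (the variable names are illustrative, not from the original code):

```python
# Shapes taken from the question and the traceback.
matrix_shape = (150854, 100)  # pre-trained word2vec vectors
layer_shape = (5000, 100)     # Embedding(5000, 100, ...)

matrix_elements = matrix_shape[0] * matrix_shape[1]
layer_elements = layer_shape[0] * layer_shape[1]

print(matrix_elements)  # 15085400, the "value has ... elements" part
print(layer_elements)   # 500000, the "shape is (5000, 100) with ... elements" part
```

The Constant initializer is asked to fill a 5000×100 tensor with a value that has about thirty times as many elements, which is why tf.constant rejects it.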

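You need to either increase the capacity of the embedding layer to match the matrix, or truncate the embedding matrix so that it keeps only the most frequent words.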
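Both options could look roughly like the sketch below. It is only an illustration under some assumptions: truncate_embeddings and build_embedding_layer are hypothetical helper names, a small random matrix stands in for the real .npy file, and truncation by row index only keeps the most frequent words if the rows of the matrix are ordered by descending frequency, which is worth verifying for your model.

```python
import numpy as np


def truncate_embeddings(emb_matrix, vocab_size):
    """Keep only the first vocab_size rows of the embedding matrix.

    Assumes rows are sorted by descending word frequency.
    """
    return emb_matrix[:vocab_size]


def build_embedding_layer(emb_matrix):
    """Create a frozen Embedding layer whose size matches emb_matrix."""
    # Imported here so the NumPy helper above works without TensorFlow.
    from tensorflow import keras

    vocab_size, dim = emb_matrix.shape
    # input_length=875 from the question can also be passed on
    # TensorFlow versions that accept that argument.
    return keras.layers.Embedding(
        vocab_size,
        dim,
        embeddings_initializer=keras.initializers.Constant(emb_matrix),
        trainable=False,
    )


# Stand-in for np.load("matrix.w2v.wv.vectors.npy"), kept small on purpose.
emb_matrix = np.random.rand(200, 100).astype("float32")

# Option 2: truncate the matrix (here to 50 rows instead of 5000).
truncated = truncate_embeddings(emb_matrix, 50)
print(truncated.shape)  # (50, 100)
```

For option 1, pass the full matrix to build_embedding_layer(emb_matrix) so the layer takes its size from the matrix. For option 2, pass the truncated matrix instead. In either case the integer indices produced by your tokenizer have to refer to the same rows of the matrix, and with truncation any index at or above the new vocabulary size needs to be mapped to an out-of-vocabulary index first.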