LSTM using word embeddings and TFIDF vectors
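I am trying to run an LSTM on a dataset that has text attributes and TFIDF vectors. I pass the text through a word embedding and feed the result into an LSTM layer. Next, I concatenate the LSTM output with the TFIDF vectors. However, line 2 of the code below (the LSTM call) raises the following error: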
"ValueError: Layer lstm_1 was called with an input that isn't a symbolic tensor. Received type: . Full input: []. All inputs to the layer should be tensors."
The code is below, where len(term_Index)+1 = 9891, emb_Dim=100, emb_Mat contains floats and has shape [9891,100], and sen_Len=1000:
embed = Embedding(len(term_Index) + 1, emb_Dim, weights=[emb_Mat],
                  input_length=sen_Len, trainable=False)
lstm = LSTM(60, dropout=0.1, recurrent_dropout=0.1)(embed)
tfidf_i = Input(shape=(max_terms_art,))
conc = Concatenate()(lstm, tfidf_i)
drop = Dropout(0.2)(conc)
dens = Dense(1)(drop)
acti = Activation('sigmoid')(dens)
model = Model([embed, tfidf_i], acti)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics = ['accuracy'])
history = model.fit([features_Train, TFIDF_Train], target_Train, epochs = 50, batch_size=128, validation_split=0.20)
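I could not reproduce your exact error. Once I called the Embedding layer on an Input tensor and passed the two tensors to Concatenate as a single list (the added brackets), the code ran without problems. See the code below: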
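from tensorflow.keras.layers import Input, Embedding, LSTM, Concatenate, Dropout, Dense, Activation
from tensorflow.keras import Model
import tensorflow as tf
import numpy as np
# Random stand-ins for the real embedding matrix and vocabulary index
emb_Mat = tf.random.normal((9891,100)).numpy()
term_Index = tf.random.uniform((9890,)).numpy()
sen_Len=1000
emb_Dim=100
max_terms_art=500
# An Embedding layer has to be called on a tensor, so create an Input for the token ids.
# Note: this Input has length len(term_Index) (9890), not sen_Len; with real data the
# sequence length would presumably be sen_Len.
inp = Input(shape=(len(term_Index),))
embed = Embedding(len(term_Index) + 1, emb_Dim, weights=[emb_Mat], input_length=sen_Len, trainable=False)(inp)
lstm = LSTM(60, dropout=0.1, recurrent_dropout=0.1)(embed)
tfidf_i = Input(shape=(max_terms_art,))
# Concatenate takes one list of tensors, not separate positional arguments
conc = Concatenate()([lstm, tfidf_i])
drop = Dropout(0.2)(conc)
dens = Dense(1)(drop)
acti = Activation('sigmoid')(dens)
# The model inputs are the two Input tensors, not the Embedding output
Model([inp, tfidf_i], acti).summary()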
Output:
Model: "model_2"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_16 (InputLayer)           [(None, 9890)]       0
__________________________________________________________________________________________________
embedding_15 (Embedding)        (None, 9890, 100)    989100      input_16[0][0]
__________________________________________________________________________________________________
lstm_8 (LSTM)                   (None, 60)           38640       embedding_15[0][0]
__________________________________________________________________________________________________
input_17 (InputLayer)           [(None, 500)]        0
__________________________________________________________________________________________________
concatenate_2 (Concatenate)     (None, 560)          0           lstm_8[0][0]
                                                                 input_17[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 560)          0           concatenate_2[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 1)            561         dropout_1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 1)            0           dense_1[0][0]
==================================================================================================
Total params: 1,028,301
Trainable params: 39,201
Non-trainable params: 989,100
__________________________________________________________________________________________________
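As a quick sanity check, the parameter counts in the summary can be reproduced with plain arithmetic. This is only a sketch: it uses the usual parameter formulas for Embedding, LSTM and Dense layers, the sizes are read off the summary, and the variable names are mine rather than part of the original code.

```python
# Sizes read off the model summary above.
vocab_rows, emb_dim = 9891, 100   # shape of emb_Mat
lstm_units = 60
tfidf_dim = 500                   # max_terms_art in the answer's code

# Embedding: one emb_dim-sized vector per vocabulary row (frozen, trainable=False).
embedding_params = vocab_rows * emb_dim

# LSTM: 4 gates, each with an input kernel, a recurrent kernel and a bias.
lstm_params = 4 * (lstm_units * (emb_dim + lstm_units) + lstm_units)

# Dense(1) on the concatenation of the LSTM output and the TFIDF vector.
concat_dim = lstm_units + tfidf_dim
dense_params = concat_dim * 1 + 1

trainable = lstm_params + dense_params
total = embedding_params + trainable

print("embedding:", embedding_params)
print("lstm:", lstm_params)
print("dense:", dense_params)
print("total:", total, "trainable:", trainable)
```

The frozen embedding accounts for all of the non-trainable parameters, so only the LSTM and the final Dense layer are updated during training.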