Get the probability of a word for text classification with LSTM in Keras

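I am doing sentiment classification with an LSTM in Keras, and I would like to get the probability the LSTM assigns to each word in a sentence, so that I can see which words are most representative.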

For example, for the following sentence:

"This landscape is wonderful and calming"

I think the words most representative of classifying this sentence as positive are "wonderful" and "calming".

How can I get the probability that the LSTM assigns to each word?

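from keras import layers, models

# input_layer, embedding_layer, size, activation and optimizer are defined elsewhere
lstm_layer = layers.LSTM(size)(embedding_layer)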

output_layer1 = layers.Dense(50, activation=activation)(lstm_layer)
output_layer1 = layers.Dropout(0.25)(output_layer1)
output_layer2 = layers.Dense(1, activation="sigmoid")(output_layer1)

model = models.Model(inputs=input_layer, outputs=output_layer2)
model.compile(optimizer=optimizer, loss='binary_crossentropy')

Thanks.

You can get the probabilities from the last layer (the dense layer with softmax). Example model:

import keras
import keras.layers as L

# instantiate sequential model
model = keras.models.Sequential()

# define input layer
model.add(L.InputLayer([None], dtype='int32'))

# define embedding layer for dictionary size of 'len(all_words)' and 50 features/units
model.add(L.Embedding(len(all_words), 50))

# define fully-connected RNN with 64 output units. Crucially: we return the outputs of the RNN for every time step instead of just the last time step
model.add(L.SimpleRNN(64, return_sequences=True))

# define dense layer of 'len(all_words)' outputs and softmax activation
# this will produce a vector of size len(all_words)
stepwise_dense = L.Dense(len(all_words), activation='softmax')

# The TimeDistributed layer adds a time dimension to the Dense layer so that it applies across the time dimension for every batch
# That is, TimeDistributed applies the Dense layer to each time-step (input word) independently. Without it, the Dense layer would apply only once to all of the time-steps concatenated.
# So, for the given time step (input word), each element 'i' in the output vector is the probability of the ith word from the target dictionary
stepwise_dense = L.TimeDistributed(stepwise_dense)
model.add(stepwise_dense)
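The comments above describe TimeDistributed as applying one Dense layer to every time step independently. As a rough illustration of that idea only (this is plain Python, not the Keras implementation, and the helper names `dense` and `time_distributed` as well as all numbers are made up for the example), the same weights are reused at each step:

```python
def dense(step_vector, weights, bias):
    """Linear dense layer for ONE time step: one output per weight row."""
    return [sum(x * w for x, w in zip(step_vector, row)) + b
            for row, b in zip(weights, bias)]


def time_distributed(layer, sequence, weights, bias):
    """Apply the same layer, with the same weights, to every time step."""
    return [layer(step, weights, bias) for step in sequence]


# Hypothetical RNN outputs: 3 time steps, 2 units each.
rnn_outputs = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
# Hypothetical dense parameters: 2 inputs -> 3 outputs.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.5, -1.0]

# One output vector per time step, all computed with the same weights and bias.
stepwise_outputs = time_distributed(dense, rnn_outputs, weights, bias)
```

The result keeps the time dimension: three input steps give three output vectors, which is why the model above needs return_sequences=True on the RNN.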

Then compile and fit (train) your model:

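model.compile('adam', 'categorical_crossentropy')

# generate_batches, BATCH_SIZE and EvaluateAccuracy are assumed to be defined elsewhere.
# The number of steps per epoch should be an integer, hence the floor division.
# Note: in recent Keras versions fit_generator is deprecated; model.fit accepts generators directly.
model.fit_generator(generate_batches(train_data), len(train_data) // BATCH_SIZE,
                    callbacks=[EvaluateAccuracy()], epochs=5)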

Finally, just use the predict function to get the probabilities:

model.predict(input_to_your_network)
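For the model above, the prediction should have one probability vector per time step, indexed as [sample][time step][class]. Here is a small sketch of reading such an output. The nested list is a made-up stand-in for what predict returns (Keras returns a NumPy array), and `top_class_per_step` is a hypothetical helper:

```python
# Made-up stand-in for a prediction: 1 sentence, 3 time steps, 4 classes.
predictions = [
    [
        [0.10, 0.70, 0.15, 0.05],
        [0.25, 0.25, 0.40, 0.10],
        [0.05, 0.05, 0.10, 0.80],
    ]
]


def top_class_per_step(sample):
    """Return (class_index, probability) of the most likely class at each time step."""
    result = []
    for step_probs in sample:
        best = max(range(len(step_probs)), key=lambda i: step_probs[i])
        result.append((best, step_probs[best]))
    return result


# Most likely class and its probability for each word of the first sentence.
top = top_class_per_step(predictions[0])
```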

To clarify, the i-th output unit of the softmax layer represents the predicted probability of the i-th class (also see here).
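As a quick check of that interpretation, here is a minimal plain-Python softmax (a sketch with hypothetical logits, not the Keras code). The outputs are positive, sum to 1, and keep the order of the inputs, so position i can be read as the probability of class i:

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for 3 classes at a single time step.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)  # probs[i] is the probability assigned to class i
```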