Use BERT for feature extraction of a unique word

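I am using BERT to extract features for a single word, given the text in which it appears. However, the current implementation in BERT's official GitHub repository (https://github.com/google-research/bert) seems able only to compute features for all the words in the text, which makes it consume too many resources. Is it possible to adapt it for this purpose? Thanks!!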

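BERT is not a context-free transformer, which means you don't want to use it for a single word the way you would with word2vec. That is really the whole point: you want to put your input in context. I mean, you can feed it a one-word sentence, but then why not just use word2vec?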

This is what the README says:

Pre-trained representations can also either be context-free or contextual, and contextual representations can further be unidirectional or bidirectional. Context-free models such as word2vec or GloVe generate a single "word embedding" representation for each word in the vocabulary, so bank would have the same representation in bank deposit and river bank. Contextual models instead generate a representation of each word that is based on the other words in the sentence.

Hope that makes sense :-)
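For what it's worth, here is a minimal sketch of the usual workaround: let BERT process the whole sentence (it has to, since each token's vector depends on the other tokens), then keep only the vectors for the word you care about. It assumes the JSON-lines layout that, as far as I can tell, `extract_features.py` writes: one record per input line with a `features` list, where each entry has a `token` and a list of `layers` with `index` and `values`. The helper name `word_vector`, the averaging of WordPiece pieces, and the tiny 2-dimensional record are my own illustration, not something from the repository.

```python
import json


def word_vector(record, word_tokens, layer_index=-1):
    """Average the vectors of the WordPiece tokens that make up one word.

    record      -- one parsed JSON line (assumed layout, see the note above)
    word_tokens -- the word as WordPiece tokens, e.g. ["bank"]
    layer_index -- which exported layer to read (assumed: -1 is the top layer)
    """
    if not word_tokens:
        raise ValueError("word_tokens must not be empty")

    tokens = [feature["token"] for feature in record["features"]]
    n = len(word_tokens)

    # Find the first position where the word's token sequence occurs.
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == word_tokens:
            break
    else:
        raise ValueError(f"{word_tokens} not found in {tokens}")

    pieces = []
    for feature in record["features"][start:start + n]:
        layer = next(l for l in feature["layers"] if l["index"] == layer_index)
        pieces.append(layer["values"])

    # Element-wise mean over the word's pieces.
    return [sum(column) / n for column in zip(*pieces)]


# Made-up record with 2-dimensional vectors, for illustration only.
line = json.dumps({
    "linex_index": 0,
    "features": [
        {"token": "[CLS]", "layers": [{"index": -1, "values": [0.0, 0.0]}]},
        {"token": "river", "layers": [{"index": -1, "values": [1.0, 2.0]}]},
        {"token": "bank", "layers": [{"index": -1, "values": [3.0, 4.0]}]},
        {"token": "[SEP]", "layers": [{"index": -1, "values": [0.0, 0.0]}]},
    ],
})
record = json.loads(line)
print(word_vector(record, ["bank"]))
```

Note that this only shrinks what you keep, not what BERT computes: the forward pass still covers the full sentence. If output size is the main cost, exporting a single layer (the script has a `--layers` flag) should also help.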

python

nlp

language-model

tensorflow