MITIE library for NLP

I'd like to understand how MITIE is integrated with Rasa. What exactly does the MITIE file total_word_feature_extractor.dat contain? I couldn't find any good documentation on this.

Thanks!

If you dig deep enough into the MITIE repos on GitHub, you can find your answer. For example, here is a bit of information from the author on what's in that file:

As for what's inside, yes, it's a variant of word2vec based on the two step CCA method from this paper: http://icml.cc/2012/papers/763.pdf. I also upgraded it to include something that is similar to the CCA method but works on out of sample words by analyzing their morphology to produce a word vector. This significantly improved the results on datasets containing lots of words not in the original dictionary.
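To make the "dictionary lookup plus a fallback for unseen words" idea concrete, here is a toy sketch. It is not MITIE's algorithm: the CCA-trained vectors are replaced by a tiny hand-written dictionary, and the morphology analysis is replaced by hashed character trigrams, purely as an assumption for illustration. All names (`DICTIONARY`, `word_vector`, and so on) are hypothetical.

```python
import hashlib

DIM = 4  # toy dimensionality; real extractors use far more dimensions

# Hypothetical stand-in for the vectors learned for in-dictionary words.
DICTIONARY = {
    "run": [1.0, 0.0, 0.0, 0.0],
    "walk": [0.0, 1.0, 0.0, 0.0],
}


def char_ngrams(word, n=3):
    """Split a word into character n-grams, with boundary markers."""
    padded = f"<{word}>"
    if len(padded) <= n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_vector(ngram):
    """Map an n-gram to a deterministic pseudo-vector via hashing (assumption)."""
    digest = hashlib.md5(ngram.encode("utf-8")).digest()
    return [digest[i] / 255.0 for i in range(DIM)]


def word_vector(word):
    """Return the dictionary vector, or build one from the word's shape."""
    key = word.lower()
    if key in DICTIONARY:
        return DICTIONARY[key]
    # Out-of-dictionary word: average the vectors of its character n-grams.
    vectors = [ngram_vector(g) for g in char_ngrams(key)]
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]
```

Calling `word_vector("run")` returns the stored vector, while `word_vector("running")` still returns a vector of the same length even though the word is not in the dictionary. That is the property the quote describes: every token gets a feature vector, including words the extractor never saw during training.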

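As for how MITIE is integrated into Rasa, it is one of a few backend choices for Rasa. It provides a few pipeline components that can do both intent classification and NER. Both use SVMs, and both use total_word_feature_extractor.dat to provide the individual word vectors.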
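If you want to poke at the file yourself, something like the sketch below may help. It assumes the MITIE Python bindings are installed and that they expose `total_word_feature_extractor` with `num_dimensions`, `num_words_in_dictionary`, and `get_feature_vector`, which is how I remember the wrapper; check it against your installed version. The helper name `describe_extractor` and its parameters are hypothetical.

```python
def describe_extractor(path, probe_word="hello"):
    """Load a MITIE feature extractor file and report a few basic facts.

    `path` should point at your own copy of total_word_feature_extractor.dat.
    """
    # Imported lazily so this module loads even when MITIE is not installed.
    from mitie import total_word_feature_extractor

    twfe = total_word_feature_extractor(path)
    # Assumed to return one float per dimension for the given word.
    vector = twfe.get_feature_vector(probe_word)
    return {
        "num_dimensions": twfe.num_dimensions,
        "num_words_in_dictionary": twfe.num_words_in_dictionary,
        "probe_vector_length": len(vector),
    }
```

Passing a word that is unlikely to be in the dictionary as `probe_word` should still give you a vector, if the out-of-sample handling described above works the way the author says.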