Nested Named Entity Recognition with Google Cloud NLP

We can do simple named entity recognition by uploading complete PDF documents, labeling simple entities, and training a model.

But does the Google Cloud AutoML platform support nested named entity recognition?

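Not by default. As far as I know, there isn't necessarily a standardized way to implement nested named entity recognition either, which may be part of the reason it is not supported. I suppose that doing it in a single process would require each annotation to contain multiple annotations inside it, which is not possible: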

Each annotation can cover up to ten tokens (words). They cannot overlap; the start_offset of an annotation cannot be between the start_offset and end_offset of an annotation in the same document. [docs]
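To make that constraint concrete, here is a minimal sketch of a check for it. `Span` and `find_violations` are hypothetical names used for illustration and are not part of any Google client library. The sketch assumes character offsets with an exclusive end, and it counts whitespace-separated words as tokens, which may differ from the tokenization AutoML actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int  # character offset where the annotation begins
    end: int    # character offset just past the annotation (assumed exclusive)
    label: str


def find_violations(text: str, spans: list[Span], max_tokens: int = 10) -> list[str]:
    """Return readable problems for annotations that break the quoted rules."""
    problems = []
    ordered = sorted(spans, key=lambda s: (s.start, s.end))

    # Rule 1: an annotation may cover at most `max_tokens` tokens.
    for span in ordered:
        n_tokens = len(text[span.start:span.end].split())  # assumption: whitespace tokens
        if n_tokens > max_tokens:
            problems.append(f"{span.label} covers {n_tokens} tokens (limit {max_tokens})")

    # Rule 2: no annotation may start inside another annotation.
    for i, outer in enumerate(ordered):
        for inner in ordered[i + 1:]:
            if inner.start >= outer.end:
                break  # spans are sorted by start, so no later span can overlap
            problems.append(f"{inner.label} starts inside {outer.label}")
    return problems


text = "Acme Corp of New York hired Jane Doe"
spans = [Span(0, 21, "ORG"), Span(13, 21, "CITY"), Span(28, 36, "PERSON")]
print(find_violations(text, spans))  # ['CITY starts inside ORG']
```

The nested CITY annotation is exactly what a nested NER labeling would need, and it is what the rule rejects.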

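However, you could implement it yourself based on your own understanding of nested NER. Train a general model to extract the primary entities (the larger, containing entities). Then train a secondary model to extract the secondary entities (the entities inside the primary ones). Run the secondary model only on the output of the primary model. You should probably also apply some conditions, such as a minimum number of tokens.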
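The two-stage idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `Entity`, `nested_ner`, and `phrase_predictor` are hypothetical names, and the two "models" are plain callables that return `(start, end, label)` tuples relative to the text they receive. In practice each callable would wrap a prediction request to one of your two trained models; that call is deliberately left out here.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Entity:
    start: int  # character offset in the full document
    end: int    # exclusive end offset in the full document
    label: str
    children: list["Entity"] = field(default_factory=list)


# A predictor takes text and returns (start, end, label) relative to that text.
Predictor = Callable[[str], list[tuple[int, int, str]]]


def nested_ner(text: str, primary: Predictor, secondary: Predictor,
               min_tokens: int = 2) -> list[Entity]:
    """Run `secondary` only on the spans that `primary` extracted."""
    entities = []
    for start, end, label in primary(text):
        outer = Entity(start, end, label)
        snippet = text[start:end]
        # Condition on token count: a one-word span cannot hold a smaller entity.
        if len(snippet.split()) >= min_tokens:
            for s, e, inner_label in secondary(snippet):
                # Shift snippet-relative offsets back to document offsets.
                outer.children.append(Entity(start + s, start + e, inner_label))
        entities.append(outer)
    return entities


def phrase_predictor(phrase: str, label: str) -> Predictor:
    """Toy stand-in for a trained model: tags one fixed phrase if present."""
    def predict(text: str) -> list[tuple[int, int, str]]:
        i = text.find(phrase)
        return [(i, i + len(phrase), label)] if i >= 0 else []
    return predict


doc = "Yesterday Acme Corp of New York hired Jane Doe"
result = nested_ner(doc,
                    phrase_predictor("Acme Corp of New York", "ORG"),
                    phrase_predictor("New York", "CITY"))
for ent in result:
    print(ent.label, repr(doc[ent.start:ent.end]),
          [(c.label, doc[c.start:c.end]) for c in ent.children])
# ORG 'Acme Corp of New York' [('CITY', 'New York')]
```

Because the secondary model only ever sees the text of a primary entity, each model's training data stays free of overlapping annotations, and the nesting is reconstructed afterwards from the offsets.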

named-entity-recognition

google-cloud-platform

google-cloud-automl