UIMA Ruta: Creating new annotations by combining existing annotations' features in plain Java

Question

I am trying to translate the following logic into a UIMA Ruta rule:

Sentence {->NewAnnotation} IF Sentence.part1 contains Constituent.label="VB" AND Sentence.part2 contains Constituent.label="VBZ"

In other words, I need to create a new annotation from the whole Sentence when its features part1 and part2 contain specific combinations, or a sequence, of POS tags (Constituent.label).

My first instinct was to use the CONTAINS condition with a STRINGLIST (and configuration parameters), as follows:

STRINGLIST posList; //assuming it is declared
Sentence{-> NewAnnotation} <-{Sentence.part1{CONTAINS(posList, Constituent.label)};};

However, it does not produce any annotations (although it does not fail either).

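Then I considered the GETFEATURE action: storing the Sentence feature (Sentence.part1) in a string variable and using it separately in the main rule. However, since GETFEATURE stores the feature as a STRING, I cannot use it to produce annotations (I need an ANNOTATION type). The same goes for the MATCHEDTEXT action.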

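I know the rule I want to build is fairly complex, but I believe Ruta is the most suitable choice for this kind of task. So, could you give me some suggestions on how to solve this?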

Answer 1

As @PeterKluegl pointed out, the solution to the original question is:

Sentence{-> NewAnnotation} <-{Sentence.part1<-{Constituent.label=="VB";} %
                              Sentence.part2<-{Constituent.label=="VB";};};

Note that this rule only works if the Sentence feature (i.e., part1) is an annotation, not a string as in my case.
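To make the intended check concrete outside of Ruta, here is a minimal plain-Java sketch of the condition the rule expresses: both parts must cover a constituent with the required label. It does not use any UIMA types; the class `PartLabelCheck`, its methods, and the representation of a part as a list of label strings are hypothetical and only serve as an illustration.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java model (assumption: no UIMA types involved).
// Each part is represented only by the labels of the constituents it covers.
class PartLabelCheck {

    // True if the part covers at least one constituent with the given label.
    static boolean containsLabel(List<String> partLabels, String label) {
        return partLabels.contains(label);
    }

    // Mirrors the conjunction in the rule: both parts must contain their label.
    static boolean qualifies(List<String> part1, String label1,
                             List<String> part2, String label2) {
        return containsLabel(part1, label1) && containsLabel(part2, label2);
    }

    public static void main(String[] args) {
        List<String> part1 = Arrays.asList("PRP", "VB", "NN");
        List<String> part2 = Arrays.asList("DT", "NN", "VBZ");
        System.out.println(qualifies(part1, "VB", part2, "VBZ")); // prints true
        System.out.println(qualifies(part1, "VB", part2, "VBD")); // prints false
    }
}
```

A sentence for which `qualifies` returns true corresponds to one that would receive NewAnnotation under the logic described in the question.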

So, for anyone who might be interested, I am also posting the solution for my case:

Store the Sentence features in separate annotations, but keep a link between Sentence.part1 and its parent Sentence (in UIMA this can be done with a parent pointer).

Apply the following rules:

String rutaRule = "STRING id;"
        + "STRING part1Id;"
        + "STRING part2Id;"
        + "Sentence{->GETFEATURE(\"matchId\", id)};"
        + "part1{->GETFEATURE(\"parent\", part1Id)};"
        + "part2{->GETFEATURE(\"parent\", part2Id)};"
        + "Sentence{AND(IF(id == part1Id), IF(id == part2Id))-> NewAnnotation} <-"
        + "{part1<-{Constituent.label == \"VBD\";} % "
        + "part2<-{Constituent.label == \"MD\" # Constituent.label == \"VBN\";};};";

Ruta.apply(cas,rutaRule);
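For readers who want to see what these rules are meant to compute, here is a rough plain-Java sketch of the same join, without UIMA. It assumes each part carries the id of its parent sentence and the ordered labels of its constituents, and it reads the `#` wildcard as "MD followed later by VBN". The names `ParentLinkJoin`, `Part`, `containsInOrder`, and `matchingSentenceIds` are hypothetical and not part of any UIMA or Ruta API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the parent-pointer join (assumption: no UIMA types involved).
class ParentLinkJoin {

    // A part annotation reduced to its parent sentence id and its constituent labels.
    static class Part {
        final String parentId;
        final List<String> labels;

        Part(String parentId, String... labels) {
            this.parentId = parentId;
            this.labels = Arrays.asList(labels);
        }
    }

    // True if 'first' occurs and 'second' occurs somewhere after it.
    static boolean containsInOrder(List<String> labels, String first, String second) {
        int i = labels.indexOf(first);
        return i >= 0 && labels.subList(i + 1, labels.size()).contains(second);
    }

    // Returns the ids of sentences whose part1 has a VBD and whose part2 has MD ... VBN.
    static List<String> matchingSentenceIds(List<String> sentenceIds,
                                            List<Part> part1s, List<Part> part2s) {
        List<String> result = new ArrayList<>();
        for (String id : sentenceIds) {
            boolean part1Ok = false;
            boolean part2Ok = false;
            for (Part p : part1s) {
                if (p.parentId.equals(id) && p.labels.contains("VBD")) part1Ok = true;
            }
            for (Part p : part2s) {
                if (p.parentId.equals(id) && containsInOrder(p.labels, "MD", "VBN")) part2Ok = true;
            }
            if (part1Ok && part2Ok) result.add(id);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("s1", "s2");
        List<Part> part1s = Arrays.asList(new Part("s1", "PRP", "VBD"), new Part("s2", "PRP", "VBZ"));
        List<Part> part2s = Arrays.asList(new Part("s1", "MD", "RB", "VBN"), new Part("s2", "MD", "VBN"));
        System.out.println(matchingSentenceIds(ids, part1s, part2s)); // prints [s1]
    }
}
```

Only the sentence whose id matches the parent id of both qualifying parts is selected, which is the effect the id comparisons in the rules are aiming for.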

Hope this helps.
