UIMA Ruta Inconsistency Word
I am using the following rules to mark hyphenated words such as off-line, new-list, VBSE-in, etc.:
(SW|CW|CAP) HYPHEN (SW|CW|CAP) HYPHEN (SW|CW|CAP) {-PARTOF(HyphenizationWord) ->MARK(ThreeHyphenizationWord,1,5)};
(SW|CW|CAP) HYPHEN (SW|CW|CAP) {-PARTOF(HyphenizationWord),-PARTOF(ThreeHyphenizationWord) ->MARK(HyphenizationWord,1,3),MARK(PreHyphenizationWords,1),MARK(PosHyphenixationWords,3)};
I have also been trying to mark words such as offline, newlist, etc.
But my script marks some words incorrectly, for example ..off in the VBSE line.
DECLARE ComplexPreWord,ComplexPostWord;
//BLOCK (foreach) HyphenizationWord{}
//{
STRING PreWord;
STRINGLIST PreWordList;
PreHyphenizationWords{->MATCHEDTEXT(PreWord),ADD(PreWordList,PreWord)};
W {INLIST(PreWordList)->ComplexPreWord};
STRING PostWord;
STRINGLIST PostWordList;
PosHyphenixationWords{->MATCHEDTEXT(PostWord),ADD(PostWordList,PostWord)};
W {INLIST(PostWordList)->ComplexPostWord};
//}
ComplexPreWord ComplexPostWord{->MARK(ComplexWord,1,2)};
Is there any way to solve my problem?
I am not sure I understood your question correctly, but maybe this is what you want:
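DECLARE Hyphen;
// Mark every "-" token as a Hyphen.
SPECIAL.ct == "-"{-> Hyphen};
DECLARE HyphenizationWord, PreHyphenizationWords, PosHyphenixationWords;
// ThreeHyphenizationWord is declared as a subtype of HyphenizationWord.
DECLARE HyphenizationWord ThreeHyphenizationWord;
// Words with two hyphens; "@" makes matching start at the first hyphen.
(W @Hyphen{-PARTOF(HyphenizationWord)} W Hyphen W){-> ThreeHyphenizationWord};
// Words with one hyphen that are not already part of a HyphenizationWord.
(W{-> PreHyphenizationWords} @Hyphen{-PARTOF(HyphenizationWord)} W{-> PosHyphenixationWords}){-> HyphenizationWord};
STRINGLIST hyphenizationWordList;
STRING mt;
// Collect the covered text of each match, with hyphens and spaces removed.
HyphenizationWord{-> MATCHEDTEXT(mt), ADD(hyphenizationWordList, replaceAll(mt, "[- ]", ""))};
DECLARE ComplexWord;
// Dictionary lookup: mark every occurrence of a collected entry.
MARKFAST(ComplexWord,hyphenizationWordList);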
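The script starts with your rules (rewritten). Then the covered text of each HyphenizationWord annotation is stored in a list, with dashes and spaces removed beforehand. This list is then simply used for a dictionary lookup with MARKFAST.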
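To make the idea easier to follow outside of Ruta, here is a rough Python sketch of the same two steps: collect the hyphenated words with hyphens and spaces stripped, then look up whole words against that collection. The names `HYPHENATED`, `collect_joined_forms` and `find_complex_words` are made up for this illustration, the lookup is assumed to be case-sensitive and limited to single ASCII-letter words, and it is not how MARKFAST is implemented.

```python
import re

# Hypothetical pattern: a word followed by one or two "-word" parts.
HYPHENATED = re.compile(r"[A-Za-z]+(?:-[A-Za-z]+){1,2}")


def collect_joined_forms(text):
    """Return the hyphenated words found in text, with '-' and spaces removed."""
    return {re.sub(r"[- ]", "", m.group()) for m in HYPHENATED.finditer(text)}


def find_complex_words(text):
    """Return (start, end, word) for each whole word that equals a joined form."""
    joined = collect_joined_forms(text)
    return [
        (m.start(), m.end(), m.group())
        for m in re.finditer(r"[A-Za-z]+", text)
        if m.group() in joined
    ]


if __name__ == "__main__":
    sample = "The off-line mode and the offline mode; VBSE-in line off."
    for start, end, word in find_complex_words(sample):
        print(start, end, word)
```

Because only complete joined forms are looked up, separate words such as "line" or "off" are not marked just because they occur as one half of some hyphenated word elsewhere in the text.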
DISCLAIMER: I am a developer of UIMA Ruta