OpenNLP find() method

Question

I am currently trying to find names in a document. I use the following method to find them:

find(String[] tokens)

I also found the following method:

find(String[] tokens,String[][] additionalContext)

What can I use this method for, and how do I use it?

Answer 1

According to the opennlp.tools.namefind.NameFinderME API docs:

public Span[] find(String[] tokens, String[][] additionalContext)
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.

Parameters:

tokens - an array of the tokens or words of the sequence, typically a sentence.

additionalContext - features which are based on context outside of the sentence but which should also be used.

Returns: an array of spans for each of the names identified.

That said, suppose your tokens are:

String[] tokens = { "lorem", "ipsum", "dolor", "sit", "amet", "adipiscing", "elit" };

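and you also want to take into account the following features, which are “based on context outside of the sentence but which should also be used”: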

String[][] additionalContext = { 
    { "nullam", "fermentum", "justo", "non", "leo", "rhoncus", "blandit" },
    { "phasellus", "at", "diam", "mattis", "arcu", "congue", "consequat" },
    { "integer", "at", "tincidunt", "turpis", "eget", "pulvinar", "nisl" } };

You can then call find(tokens, additionalContext).
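
One caveat about the shape of additionalContext: as far as I can tell from the OpenNLP source, the outer array is looked up by token index, so it probably needs either one row of features per token or no rows at all, rather than one row per neighbouring sentence. Please check this against the version you use. The sketch below is a minimal, library-free illustration under that assumption. buildAdditionalContext, the previousTags map, and the "pd=" feature prefix are hypothetical names made up for this example and are not OpenNLP API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class AdditionalContextSketch {

    // Hypothetical helper (not part of OpenNLP): builds one feature row per token.
    // Assumption: find(tokens, additionalContext) reads additionalContext[i] for token i.
    static String[][] buildAdditionalContext(String[] tokens, Map<String, String> previousTags) {
        String[][] context = new String[tokens.length][1];
        for (int i = 0; i < tokens.length; i++) {
            // "pd=" is an illustrative prefix; tokens missing from the map yield "pd=null".
            context[i][0] = "pd=" + previousTags.get(tokens[i]);
        }
        return context;
    }

    public static void main(String[] args) {
        String[] tokens = { "lorem", "ipsum", "dolor", "sit", "amet", "adipiscing", "elit" };

        // Information gathered outside the current sentence, e.g. how a token
        // was tagged earlier in the same document (made-up data).
        Map<String, String> previousTags = new HashMap<>();
        previousTags.put("ipsum", "person-start");

        String[][] additionalContext = buildAdditionalContext(tokens, previousTags);
        System.out.println(Arrays.deepToString(additionalContext));

        // With OpenNLP on the classpath and a trained model loaded, the call would then be:
        // Span[] names = nameFinder.find(tokens, additionalContext);
    }
}
```

Extra features like these are only likely to help if the model was trained with the same kind of features; otherwise the model has no weights for them.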

Note that, according to the code on GitHub, find(String[] tokens) simply calls find(tokens, EMPTY), where String[][] EMPTY = new String[0][0].
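
To make that delegation concrete, here is a small self-contained sketch of the same overload pattern. It is not the OpenNLP implementation: OverloadSketch is a hypothetical class, and its "model" is a toy rule (a token counts as a name if it is capitalized or if its context row contains the made-up feature "name"). It only shows that the one-argument form behaves like the two-argument form with no extra features.

```java
import java.util.ArrayList;
import java.util.List;

final class OverloadSketch {

    // Mirrors the EMPTY constant quoted above: no rows, no features.
    private static final String[][] EMPTY = new String[0][0];

    // One-argument form: delegates with an empty additional context.
    static List<Integer> find(String[] tokens) {
        return find(tokens, EMPTY);
    }

    // Toy stand-in for a trained model; returns the indices of tokens treated as names.
    static List<Integer> find(String[] tokens, String[][] additionalContext) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            boolean capitalized = !tokens[i].isEmpty()
                    && Character.isUpperCase(tokens[i].charAt(0));
            boolean hinted = false;
            // An empty context means "no extra features", so per-token rows are skipped.
            if (additionalContext.length != 0) {
                for (String feature : additionalContext[i]) {
                    if ("name".equals(feature)) {
                        hinted = true;
                    }
                }
            }
            if (capitalized || hinted) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] tokens = { "hello", "Alice", "bob" };
        String[][] context = { {}, {}, { "name" } };
        System.out.println(find(tokens));          // uses EMPTY internally
        System.out.println(find(tokens, context)); // also uses the per-token hint
    }
}
```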

java

text

text-mining

opennlp