OpenNLP find() method

I am currently trying to find names in a document. I use the following method to find them:

find(String[] tokens)

Right below it, I also found this method:

find(String[] tokens, String[][] additionalContext)

What can I do with this method, and how do I use it?

According to the opennlp.tools.namefind.NameFinderME API docs:

public Span[] find(String[] tokens, String[][] additionalContext)

Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.

Parameters:

  • tokens - an array of the tokens or words of the sequence, typically a sentence.
  • additionalContext - features which are based on context outside of the sentence but which should also be used.

Returns: an array of spans for each of the names identified.

That said, suppose your tokens are:

String[] tokens = { "lorem", "ipsum", "dolor", "sit", "amet", "adipiscing", "elit" };

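But you also need to take into account the following features, which are "based on context outside of the sentence but which should also be used":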

String[][] additionalContext = { 
    { "nullam", "fermentum", "justo", "non", "leo", "rhoncus", "blandit" },
    { "phasellus", "at", "diam", "mattis", "arcu", "congue", "consequat" },
    { "integer", "at", "tincidunt", "turpis", "eget", "pulvinar", "nisl" } };

You can then call find(tokens, additionalContext).

Note that, according to the code on GitHub, find(String[] tokens) is actually just find(tokens, EMPTY) (where String[][] EMPTY = new String[0][0]).
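To make the relationship between the two overloads concrete, here is a minimal sketch of the delegation pattern. FindOverloadSketch is a hypothetical stand-in, not the real NameFinderME: it needs no model and does no name finding. It also assumes, purely for illustration, that additionalContext holds one row of extra features per token; check the OpenNLP source for how the real feature generator reads the array.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in, NOT opennlp.tools.namefind.NameFinderME.
class FindOverloadSketch {

    // Same shape as the EMPTY constant quoted above.
    static final String[][] EMPTY = new String[0][0];

    // One-argument overload: delegates with no additional context.
    static List<String> find(String[] tokens) {
        return find(tokens, EMPTY);
    }

    // Two-argument overload. Assumption for illustration only: row i of
    // additionalContext carries the extra features for tokens[i].
    // A real name finder would return Span[]; here we only describe the inputs.
    static List<String> find(String[] tokens, String[][] additionalContext) {
        List<String> described = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            String[] extra = i < additionalContext.length
                    ? additionalContext[i]
                    : new String[0];
            described.add(tokens[i] + Arrays.toString(extra));
        }
        return described;
    }

    public static void main(String[] args) {
        String[] tokens = { "lorem", "ipsum" };
        String[][] context = { { "nullam" }, { "phasellus", "at" } };

        System.out.println(find(tokens));          // no extra features
        System.out.println(find(tokens, EMPTY));   // same as the line above
        System.out.println(find(tokens, context)); // extra features attached
    }
}
```

In this sketch, calling find(tokens) and find(tokens, EMPTY) gives the same result, which is the point made above: the one-argument form is only a convenience for the case where you have no extra features to supply.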