OpenNLP find() method
I am currently trying to find names in a document. I use the following method to find them:
find(String[] tokens)
I also came across this overload:
find(String[] tokens,String[][] additionalContext)
What can I use this method for, and how do I use it?
According to the opennlp.tools.namefind.NameFinderME apidocs:
public Span[] find(String[] tokens, String[][] additionalContext)
Generates name tags for the given sequence, typically a sentence,
returning token spans for any identified names.
Parameters:
tokens - an array of the tokens or words of the sequence, typically a sentence.
additionalContext - features which are based on context outside of the sentence but which should also be used.
Returns:
an array of spans for each of the names identified.
That being said, suppose your tokens are:
String[] tokens = { "lorem", "ipsum", "dolor", "sit", "amet", "adipiscing", "elit" };
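But you also need to take into account the following features, which are "based on context outside of the sentence but which should also be used":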
String[][] additionalContext = {
{ "nullam", "fermentum", "justo", "non", "leo", "rhoncus", "blandit" },
{ "phasellus", "at", "diam", "mattis", "arcu", "congue", "consequat" },
{ "integer", "at", "tincidunt", "turpis", "eget", "pulvinar", "nisl" } };
Then you can call find(tokens, additionalContext).
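One caveat about the shape of the array: as far as I can tell from the OpenNLP sources, the additional context is looked up by token index, so additionalContext[i] is expected to hold the extra features for tokens[i], and an array with fewer rows than tokens may fail at runtime. Please verify this against the version you use. Below is a minimal sketch of building one row per token. The helper buildContext and the "topic=" feature are made up for illustration and are not part of OpenNLP.

```java
import java.util.Arrays;

// Hypothetical helper, not part of OpenNLP: builds one feature row per token.
class AdditionalContextSketch {

    // Returns an array with tokens.length rows; row i holds extra features for tokens[i].
    static String[][] buildContext(String[] tokens, String documentTopic) {
        String[][] context = new String[tokens.length][];
        for (int i = 0; i < tokens.length; i++) {
            // Made-up document-level feature, repeated for every token.
            context[i] = new String[] { "topic=" + documentTopic };
        }
        return context;
    }

    public static void main(String[] args) {
        String[] tokens = { "lorem", "ipsum", "dolor", "sit", "amet", "adipiscing", "elit" };
        String[][] additionalContext = buildContext(tokens, "latin");
        System.out.println(Arrays.deepToString(additionalContext));
        // With OpenNLP on the classpath and a loaded NameFinderME called nameFinder:
        // Span[] names = nameFinder.find(tokens, additionalContext);
    }
}
```

Keep in mind that extra features are only likely to help if the model was trained with the same kind of additional context.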
Note that, according to the code on GitHub, find(String[] tokens) is actually find(tokens, EMPTY) (with String[][] EMPTY = new String[0][0]).
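To make the delegation concrete, here is a small stand-in that mimics the pattern without depending on OpenNLP. The class DelegatingFinderSketch and the method featuresFor are hypothetical, and the rule that an empty context contributes no extra features is my reading of the library's behaviour, not something the apidocs state.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; this is not OpenNLP's NameFinderME.
class DelegatingFinderSketch {
    private static final String[][] EMPTY = new String[0][0];

    // One-argument overload forwards to the two-argument one with an empty context.
    static List<String> featuresFor(String[] tokens) {
        return featuresFor(tokens, EMPTY);
    }

    // Collects a feature string per token, appending any per-token extra features.
    static List<String> featuresFor(String[] tokens, String[][] additionalContext) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            StringBuilder sb = new StringBuilder("w=" + tokens[i]);
            // An empty context array means there is nothing to look up for this token.
            if (additionalContext.length != 0) {
                for (String feature : additionalContext[i]) {
                    sb.append(' ').append(feature);
                }
            }
            out.add(sb.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        String[] tokens = { "lorem", "ipsum" };
        System.out.println(featuresFor(tokens));
        System.out.println(featuresFor(tokens, new String[][] { { "topic=latin" }, { "topic=latin" } }));
    }
}
```

So if you have no extra features to supply, calling find(tokens) is enough.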