Set up OpenNLP: Text Detection using the Vision java API

Question

Set up OpenNLP
Download Tokenizer data and save it to this directory.

wget http://opennlp.sourceforge.net/models-1.5/en-token.bin

This is what Google says here.

My question

I did not know what OpenNLP was, so I googled it.

This is what Apache says on its OpenNLP setup page:

If you have an IDE installed such as NetBeans or Eclipse installed, it will make your development easier. However, follow on for the brave.

I have IntelliJ and NetBeans. How do I set this up?

When I build with Maven in IntelliJ and try to run the sample, this is the error I get:

java.io.FileNotFoundException: en-token.bin (The system cannot find the file specified)

When I try to carry on with the Google documentation, I fail at this line:

java -cp target/vision-text-1.0-SNAPSHOT-jar-with-dependencies.jar com.google.cloud.vision.samples.text.TextApp data/

with the error:

Error: Could not find or load main class com.google.cloud.vision.samples.text.TextApp

Answer 1

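It looks like you only need the OpenNLP tokenizer .bin file. It is just a pre-trained binary file that the library uses to tokenize text (for example, to split a sentence into words). You do not appear to need anything else from that library: if you look at the Google Vision pom file (https://github.com/GoogleCloudPlatform/java-docs-samples/blob/master/vision/text/pom.xml) you will see that the sample already depends on the OpenNLP JAR, so all you are fetching here is a pre-trained resource for that library.

Assuming you have cloned the GitHub repository and successfully run the Maven command they mention: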

mvn clean compile assembly:single

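then the downloaded file (en-token.bin) should be copied to the root of the project directory (the same location as pom.xml, and the directory from which you run the java command).

With that setup in place, it should work.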
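If it still fails, a quick way to check both preconditions is a small stand-alone program like the sketch below. It is only an illustration, not part of the Google sample or of OpenNLP: the class name `SetupCheck` and its helper methods are made up for this answer, and the jar and class names are simply the ones quoted in the question. It assumes the sample opens `en-token.bin` by a relative name, which is what the `FileNotFoundException` suggests, so the file has to sit in the directory the `java` command is started from.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

// Hypothetical helper for diagnosing the setup; not part of the sample.
class SetupCheck {

    // A relative name such as "en-token.bin" is resolved against the
    // working directory, i.e. the directory java was started from.
    static Path resolve(Path workingDir, String relativeName) {
        return workingDir.resolve(relativeName).toAbsolutePath().normalize();
    }

    // True if the resolved path exists and is a regular file.
    static boolean fileExists(Path workingDir, String relativeName) {
        return Files.isRegularFile(resolve(workingDir, relativeName));
    }

    // Converts a fully qualified class name to its entry path inside a jar.
    static String classEntry(String className) {
        return className.replace('.', '/') + ".class";
    }

    // True if the jar exists and contains the compiled class.
    static boolean jarContainsClass(Path jar, String className) {
        if (!Files.isRegularFile(jar)) {
            return false;
        }
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(classEntry(className)) != null;
        } catch (IOException e) {
            // An unreadable or corrupt jar is treated as "class not found".
            return false;
        }
    }

    public static void main(String[] args) {
        Path cwd = Paths.get(System.getProperty("user.dir"));
        // Names taken from the command and error message in the question.
        Path jar = cwd.resolve("target/vision-text-1.0-SNAPSHOT-jar-with-dependencies.jar");
        String mainClass = "com.google.cloud.vision.samples.text.TextApp";

        System.out.println("Working directory: " + cwd);
        System.out.println("en-token.bin present: " + fileExists(cwd, "en-token.bin"));
        System.out.println("Jar contains " + mainClass + ": " + jarContainsClass(jar, mainClass));
    }
}
```

On Java 11 or later this can be run as `java SetupCheck.java` from the directory that contains pom.xml. If the first check prints `false`, the model file is not in the working directory; if the second prints `false`, the Maven build probably did not produce the jar, or you are running from a different directory. When launching from IntelliJ, the working directory is whatever the run configuration specifies, so it is worth checking that it points at the project root.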

