Apache Nutch inject urls

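I am new to Apache Nutch (2.3.1) and MongoDB (3.4.7). After completing the installation steps, I wanted to inject URLs and crawl the Wikipedia site. When I ran "./nutch inject urls" in the terminal, I got this error.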

~/apache-nutch-2.3.1/runtime/local/bin$ ./nutch inject urls
InjectorJob: starting at 2017-11-26 19:07:35
InjectorJob: Injecting urlDir: urls
InjectorJob: org.apache.gora.util.GoraException: java.lang.NullPointerException
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
    at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:78)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
    at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
Caused by: java.lang.NullPointerException
    at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:936)
    at java.util.concurrent.ConcurrentHashMap.containsKey(ConcurrentHashMap.java:964)
    at org.apache.gora.mongodb.store.MongoStore.getDB(MongoStore.java:192)
    at org.apache.gora.mongodb.store.MongoStore.initialize(MongoStore.java:122)
    at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
    ... 7 more

It turned out that I had set the wrong Mongo database name in the $NUTCH_HOME/conf/gora.properties file. After fixing it, Apache Nutch worked correctly.
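
For anyone hitting the same trace: the NullPointerException comes from a ConcurrentHashMap lookup inside MongoStore.getDB, which is consistent with the database name resolving to null (ConcurrentHashMap does not accept null keys). Below is a minimal sketch of a check that reports Mongo-related entries that are missing or empty in gora.properties. The key names (gora.datastore.default, gora.mongodb.servers, gora.mongodb.db) and the sample values are assumptions based on the commonly used gora-mongodb setup, not something taken from the post, so compare them with your own file. The helper names are hypothetical.

```python
import sys

# Assumed key names for a gora-mongodb setup; verify against your own file.
REQUIRED_KEYS = (
    "gora.datastore.default",
    "gora.mongodb.servers",
    "gora.mongodb.db",
)

# Illustrative sample only: the database-name line is commented out on purpose.
SAMPLE = """\
gora.datastore.default=org.apache.gora.mongodb.store.MongoStore
gora.mongodb.servers=localhost:27017
# gora.mongodb.db=nutch
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' or '!' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(props, required=REQUIRED_KEYS):
    """Return the required keys that are absent or set to an empty value."""
    return [key for key in required if not props.get(key)]


if __name__ == "__main__":
    # Pass the path to gora.properties as the first argument,
    # or run without arguments to check the built-in sample.
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as handle:
            text = handle.read()
    else:
        text = SAMPLE
    print("missing or empty:", missing_keys(parse_properties(text)))
```

Saved under any file name and run with the path to $NUTCH_HOME/conf/gora.properties as its argument, it prints the assumed keys that are not set. It only catches missing or empty entries; a database name that is present but misspelled still has to be compared by hand against the databases that actually exist in MongoDB.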