Indexing takes too long when integrating Nutch 2.3, HBase and Solr
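I am integrating Nutch, HBase and Solr.
I have configured Nutch, HBase and Solr and have crawled a website. To integrate Nutch with Solr I followed the guide
"Integrating Nutch 2.3, HBase and Solr" and ran the command
java -jar start.jar in /opt/solr-4.8.1/examples.
The process started, but it has been running for about 10 days and has still not finished.
I cannot work out what is going wrong.
Can anyone suggest what the problem is and how to solve it?
Below are some details from the log file.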
INFO - 2016-05-18 15:58:00.286; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=true,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO - 2016-05-18 15:58:00.287; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
INFO - 2016-05-18 15:58:00.287; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
INFO - 2016-05-18 15:58:00.288; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO - 2016-05-18 15:58:00.288; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp=/solr path=/update params={waitFlush=true&optimize=true&wt=json&_=1463567280272} {optimize=} 0 2
INFO - 2016-05-18 15:58:01.976; org.apache.solr.update.DirectUpdateHandler2; start commit{,optimize=true,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO - 2016-05-18 15:58:01.976; org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes. Skipping IW.commit.
INFO - 2016-05-18 15:58:01.977; org.apache.solr.core.SolrCore; SolrIndexSearcher has not changed - not re-opening: org.apache.solr.search.SolrIndexSearcher
INFO - 2016-05-18 15:58:01.977; org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO - 2016-05-18 15:58:01.978; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp=/solr path=/update params={waitFlush=true&optimize=true&wt=json&_=1463567281965} {optimize=} 0 2
INFO - 2016-05-18 15:58:05.799; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/threads params={wt=json&_=1463567285780} status=0 QTime=8
INFO - 2016-05-18 15:58:09.267; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/properties params={wt=json&_=1463567289183} status=0 QTime=0
INFO - 2016-05-18 15:58:11.225; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/cores params={wt=json&_=1463567291213} status=0 QTime=1
INFO - 2016-05-18 15:58:11.260; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/cores params={wt=json&_=1463567291242} status=0 QTime=1
INFO - 2016-05-18 15:58:13.808; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/luke params={show=index&numTerms=0&wt=json&_=1463567293791} status=0 QTime=1
INFO - 2016-05-18 15:58:13.821; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/replication params={wt=json&command=details&_=1463567293794} status=0 QTime=1
INFO - 2016-05-18 15:58:13.837; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/system params={wt=json&_=1463567293796} status=0 QTime=4
INFO - 2016-05-18 15:58:13.845; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/file/ params={file=admin-extra.html&_=1463567293798} status=0 QTime=0
INFO - 2016-05-18 15:58:13.856; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/ping params={action=status&wt=json&_=1463567293801} status=503 QTime=1
INFO - 2016-05-18 16:54:35.235; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/logging params={wt=json&since=0&_=1463570675193} status=0 QTime=1
INFO - 2016-05-18 16:54:38.820; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/replication params={wt=json&command=details&_=1463570678769} status=0 QTime=0
INFO - 2016-05-18 16:54:38.821; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/luke params={show=index&numTerms=0&wt=json&_=1463570678764} status=0 QTime=2
INFO - 2016-05-18 16:54:38.823; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/ping params={action=status&wt=json&_=1463570678776} status=503 QTime=0
INFO - 2016-05-18 16:54:38.829; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/file/ params={file=admin-extra.html&_=1463570678774} status=0 QTime=1
INFO - 2016-05-18 16:54:38.831; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/system params={wt=json&_=1463570678772} status=0 QTime=11
INFO - 2016-05-18 16:54:46.728; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/mbeans params={stats=true&wt=json&_=1463570686705} status=0 QTime=5
INFO - 2016-05-18 16:54:49.533; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/mbeans params={stats=true&wt=json&_=1463570689477} status=0 QTime=3
INFO - 2016-05-18 16:54:52.762; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/replication params={wt=json&command=details&_=1463570692692} status=0 QTime=0
INFO - 2016-05-18 16:56:33.180; org.apache.solr.servlet.SolrDispatchFilter; [admin] webapp=null path=/admin/info/logging params={wt=json&since=0&_=1463570793166} status=0 QTime=0
INFO - 2016-05-18 16:56:38.195; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/luke params={show=index&numTerms=0&wt=json&_=1463570798128} status=0 QTime=0
INFO - 2016-05-18 16:56:38.198; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/replication params={wt=json&command=details&_=1463570798132} status=0 QTime=0
INFO - 2016-05-18 16:56:38.199; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/ping params={action=status&wt=json&_=1463570798137} status=503 QTime=0
INFO - 2016-05-18 16:56:38.201; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/file/ params={file=admin-extra.html&_=1463570798135} status=0 QTime=0
INFO - 2016-05-18 16:56:38.211; org.apache.solr.core.SolrCore; [collection1] webapp=/solr path=/admin/system params={wt=json&_=1463570798133} status=0 QTime=12
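Finally solved it.
Basically, java -jar start.jar downloads jar files, so it was not indexing here but downloading the Solr 4.8 jars and then configuring it. I replaced Solr 4.8 with Solr 5.2.1 for performance reasons, and Solr now works fine.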
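As a rough cross-check, the log excerpt in the question can be summarized with a short script. The sketch below is only an illustration: the names summarize, REQUEST_RE and SAMPLE are made up for this example and are not part of Solr or Nutch, and the regular expression assumes the log line layout shown in the question. It counts how many commits report "No uncommitted changes" and which request paths and status codes appear.

```python
import re
from collections import Counter

# Assumed layout (taken from the excerpt in the question):
#   "... path=/admin/ping params={...} status=503 QTime=0"
REQUEST_RE = re.compile(r"path=(?P<path>\S+) params=\{.*?\}.*?status=(?P<status>\d+)")


def summarize(lines):
    """Return (Counter of (path, status) pairs, number of empty commits)."""
    requests = Counter()
    empty_commits = 0
    for line in lines:
        # A commit that found nothing new to write to the index.
        if "No uncommitted changes" in line:
            empty_commits += 1
        match = REQUEST_RE.search(line)
        if match:
            requests[(match.group("path"), int(match.group("status")))] += 1
    return requests, empty_commits


# Three lines copied from the log excerpt in the question.
SAMPLE = [
    "INFO - 2016-05-18 15:58:00.287; org.apache.solr.update.DirectUpdateHandler2; "
    "No uncommitted changes. Skipping IW.commit.",
    "INFO - 2016-05-18 15:58:13.856; org.apache.solr.core.SolrCore; [collection1] "
    "webapp=/solr path=/admin/ping params={action=status&wt=json&_=1463567293801} "
    "status=503 QTime=1",
    "INFO - 2016-05-18 15:58:13.837; org.apache.solr.core.SolrCore; [collection1] "
    "webapp=/solr path=/admin/system params={wt=json&_=1463567293796} status=0 QTime=4",
]

if __name__ == "__main__":
    requests, empty_commits = summarize(SAMPLE)
    print("empty commits:", empty_commits)
    for (path, status), count in sorted(requests.items()):
        print(path, status, count)
```

If every commit in the window reports "No uncommitted changes" and the remaining lines are only admin paths, that suggests no documents were reaching Solr during that period, so the long run time was not spent on indexing.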