Migrating data from Solr 1.4 to 4.10 through 3.5: index issue
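I am trying to upgrade from Solr 1.4 to Solr 4.10, using Solr 3.5 as an intermediate step between the two major versions. The dataset I am migrating is fairly large: the data folder is over 13 GB. The migration from 1.4 to 3.5 succeeded. I then copied the collection's data folder and the conf folder from Solr 3.5 to Solr 4.10. However, I get the following error:
**ERROR CoreContainer Error creating core [newsarchive]: Error opening new searcher**
The detailed log output is: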
org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:873)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:646)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:491)
at org.apache.solr.core.CoreContainer.call(CoreContainer.java:255)
at org.apache.solr.core.CoreContainer.call(CoreContainer.java:249)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:845)
... 8 more
Caused by: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource: MMapIndexInput(path="C:\news\data\newsarchive\index\_4p.fdx")): 1 (needs to be between 2 and 3). This version of Lucene only supports indexes created with release 3.0 and later.
at org.apache.lucene.codecs.lucene3x.Lucene3xStoredFieldsReader.checkCodeVersion(Lucene3xStoredFieldsReader.java:121)
at org.apache.lucene.codecs.lucene3x.Lucene3xSegmentInfoReader.readLegacyInfos(Lucene3xSegmentInfoReader.java:75)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:418)
at org.apache.lucene.index.SegmentInfos.doBody(SegmentInfos.java:458)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:913)
at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:759)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:454)
at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:794)
at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:279)
at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:111)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1528)
... 10 more
In addition, at the end of the log, after the error above, I found the following errors:
ERROR SolrIndexWriter SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
and
ERROR SolrIndexWriter Error closing IndexWriter
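When Solr opens an older index, it updates it as little as possible. In general, this means that old segments are *not* rewritten in the new format until they are merged. That saves a lot of I/O, and people with large indexes probably do not want their entire index rewritten every time the index format changes slightly. However, it also means you have to do some extra work to migrate an index from an older version of Solr (more than one major version behind).
The recommended procedure is to use org.apache.lucene.index.IndexUpgrader to upgrade the index without a full merge. The simplest approach is to get lucene-core 3.5 from Maven and run `java -cp lucene-core.jar org.apache.lucene.index.IndexUpgrader [-delete-prior-commits] [-verbose] indexDir`, then repeat with lucene-core 4.10.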
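The two passes described above can be scripted. The following is only a minimal sketch, not a tested migration tool: the class name `TwoStageUpgrade`, the jar paths, and the index path are all hypothetical and must be replaced with your own. It assumes `java` is on the PATH, that Solr is stopped, and that you have backed up the index directory first, since the upgrader rewrites the index in place.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: runs IndexUpgrader once per lucene-core jar, in order.
public class TwoStageUpgrade {

    // Builds the command line quoted in the answer for one lucene-core jar.
    static List<String> upgradeCommand(String luceneCoreJar, String indexDir,
                                       boolean deletePriorCommits) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(luceneCoreJar);
        cmd.add("org.apache.lucene.index.IndexUpgrader");
        if (deletePriorCommits) {
            // Only needed when the index holds more than one commit point.
            cmd.add("-delete-prior-commits");
        }
        cmd.add("-verbose");
        cmd.add(indexDir);
        return cmd;
    }

    // Runs one command, forwarding its output, and returns the exit code.
    static int run(List<String> cmd) throws IOException, InterruptedException {
        return new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println(
                "usage: TwoStageUpgrade <lucene-core-3.5 jar> <lucene-core-4.10 jar> <indexDir>");
            System.exit(2);
        }
        String indexDir = args[2];
        // Order matters: the 3.5 pass must finish before the 4.10 pass starts.
        String[] jars = {args[0], args[1]};
        for (String jar : jars) {
            List<String> cmd = upgradeCommand(jar, indexDir, false);
            System.out.println("Running: " + String.join(" ", cmd));
            int exit = run(cmd);
            if (exit != 0) {
                System.err.println("IndexUpgrader failed with exit code " + exit);
                System.exit(exit);
            }
        }
    }
}
```

For the index in the question, the third argument would be the `index` folder named in the stack trace (`C:\news\data\newsarchive\index`). Stopping at the first non-zero exit code keeps the 4.10 pass from running against an index that the 3.5 pass did not finish rewriting.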