Reading the internals of a Solr index file in Java
I am trying to read a Solr index file. The file was created by one of the examples from the Solr download page, version 6.4.
I am using this code:
import java.io.File;
import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class TestIndex {
    public static void main(String[] args) throws IOException {
        // Backslashes in a Java string literal must be escaped.
        Directory dirIndex = FSDirectory.open(new File("D:\\data\\data\\index"));
        IndexReader indexReader = IndexReader.open(dirIndex);
        Document doc = null;
        // Loads every document in turn; only the last one is printed below.
        for (int i = 0; i < indexReader.numDocs(); i++) {
            doc = indexReader.document(i);
        }
        System.out.println(doc.toString());
        indexReader.close();
        dirIndex.close();
    }
}
Solr jar: solr-solrj-6.5.1.jar
Lucene jar: lucene-core-r1211247.jar
Exception:
Exception in thread "main"
org.apache.lucene.index.IndexFormatTooOldException: Format version is not
supported (resource:
ChecksumIndexInput(MMapIndexInput(path="D:\data\data\index\segments_2"))):
1071082519 (needs to be between -9 and -12). This version of Lucene only
supports indexes created with release 3.0 and later.
Updated code using Lucene 6.5.1:
    // Backslashes in a Java string literal must be escaped.
    Path path = FileSystems.getDefault().getPath("D:\\data\\data\\index");
    Directory dirIndex = FSDirectory.open(path);
    DirectoryReader dr = DirectoryReader.open(dirIndex);
    Document doc = null;
    // Loads every document in turn; only the last one is printed below.
    for (int i = 0; i < dr.numDocs(); i++) {
        doc = dr.document(i);
    }
    System.out.println(doc.toString());
    dr.close();
    dirIndex.close();
Exception:
java.lang.UnsupportedClassVersionError: org/apache/lucene/store/Directory : Unsupported major.minor version 52.0.
Can you help me get this code running?
Thanks,
Virendra Agarwal
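That Lucene jar appears to date from 2012, so it is more than five years old. Use lucene-core-6.5.1 to read index files generated by Solr 6.5.1.
If the build is picking up an arbitrarily named jar file by mistake, you can pin the dependency in your build file.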
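The UnsupportedClassVersionError from the updated attempt looks like a separate problem from the jar mismatch. Class file major version 52 corresponds to Java 8, which Lucene 6.x requires, so the error suggests the code was run on an older JVM. Below is a minimal sketch, using only the JDK, of how that number is read from a class file header. ClassVersionCheck and its method names are hypothetical, made up for this illustration, and the 8-byte header in main is synthetic rather than taken from the actual Lucene jar.

```java
import java.nio.ByteBuffer;

public class ClassVersionCheck {

    /** Returns the major version stored in the first 8 bytes of a class file. */
    public static int majorVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("Too short to be a class file");
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("Missing class file magic number");
        }
        buf.getShort();                 // minor version, not needed here
        return buf.getShort() & 0xFFFF; // major version as an unsigned 16-bit value
    }

    /** Maps a major version to a Java release; valid for Java 5 (major 49) and later. */
    public static int javaRelease(int major) {
        return major - 44;
    }

    public static void main(String[] args) {
        // Synthetic header: magic number, minor version 0, major version 52.
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        int major = majorVersion(header);
        System.out.println("major " + major + " -> Java " + javaRelease(major));
        // The release of the JVM that is actually running this program.
        System.out.println("running on Java " + System.getProperty("java.specification.version"));
    }
}
```

If the second line reports a release below 8 (shown as 1.7 or lower on older JVMs), running the program on Java 8 or later should clear this particular error.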
I would suggest using Luke:
https://github.com/DmitryKey/luke
Luke is the GUI tool for introspecting your Lucene / Solr / Elasticsearch index. It allows:
- Viewing your documents and analyzing their field contents (for stored fields)
- Searching in the index
- Performing index maintenance: index health checking, index optimization (take a backup before running this!)
- Reading index from hdfs
- Exporting the index or portion of it into an xml format
- Testing your custom Lucene analyzers
- Creating your own plugins!