Migrate Lucene FST files from 5.1.0 to 8.9.0
I have files containing FSTs that were created with Lucene 5.1.0.
After upgrading to Lucene 8.9.0, I get an exception when I try to read an FST from one of these files:
org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource org.apache.lucene.store.InputStreamDataInput@34ce8af7): 4 (needs to be between 6 and 7). This version of Lucene only supports indexes created with release 6.0 and later.
Is there any way to upgrade the old FST files to the new format?
This is how I solved it.
While still on the old Lucene version, write everything in the FST to a text file:
public static <T> void writeToTextFile(FST<T> fst, Path filePath) throws IOException {
    try (BufferedWriter writer = Files.newBufferedWriter(filePath)) {
        // The enum walks the FST in sorted key order, so the file is written sorted.
        BytesRefFSTEnum<T> fstEnum = new BytesRefFSTEnum<>(fst);
        while (fstEnum.next() != null) {
            BytesRefFSTEnum.InputOutput<T> inputOutput = fstEnum.current();
            // One entry per line: key, a tab, then the output's string form.
            writer.write(inputOutput.input.utf8ToString() + "\t" + inputOutput.output.toString() + "\n");
        }
    }
}
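One caveat with the dump above: it assumes every key is valid UTF-8 and contains no tab or newline. If that does not hold for your data, a lossless alternative is to hex-encode the raw key bytes before writing them. The following is only a minimal sketch in plain Java; the class `FstDumpLine` and its methods are hypothetical names, not part of Lucene. With a `BytesRef` you would pass its `bytes`, `offset` and `length` fields to `toHex`.

```java
/** Hypothetical helper: hex-encodes raw key bytes so that any byte value survives a text dump. */
final class FstDumpLine {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Encodes length bytes starting at offset as lowercase hex, two characters per byte. */
    static String toHex(byte[] bytes, int offset, int length) {
        StringBuilder sb = new StringBuilder(length * 2);
        for (int i = offset; i < offset + length; i++) {
            int b = bytes[i] & 0xFF; // treat the byte as unsigned
            sb.append(HEX[b >>> 4]).append(HEX[b & 0x0F]);
        }
        return sb.toString();
    }

    /** Decodes a string produced by toHex back into the original bytes. */
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Odd number of hex digits: " + hex);
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hex string: " + hex);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    /** Builds one dump line: hex key, a tab, then the output's string form (without a line break). */
    static String formatLine(byte[] key, String output) {
        return toHex(key, 0, key.length) + "\t" + output;
    }
}
```

The reading side would then call `fromHex` on the first column and wrap the result in a `BytesRef`. The output column is still written as plain text, so this sketch only helps if the output strings themselves contain no tab or newline.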
Then switch to the new Lucene version and rebuild the FST from the text file:
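public static <T> FST<T> readFromTextFile(Path filePath, Outputs<T> outputs, Function<String, T> fromString) throws IOException {
    // Note: in some 8.x releases this class may be named FSTCompiler rather than Builder; check the javadoc for your version.
    Builder<T> builder = new Builder<>(FST.INPUT_TYPE.BYTE1, outputs);
    IntsRefBuilder scratchInts = new IntsRefBuilder();
    try (BufferedReader reader = Files.newBufferedReader(filePath)) {
        String line;
        // Read every line, not just the first. Keys must be added in sorted order, which is how the dump wrote them.
        while ((line = reader.readLine()) != null) {
            String[] split = line.split("\t", 2);
            BytesRef scratchBytes = new BytesRef(split[0]);
            builder.add(Util.toIntsRef(scratchBytes, scratchInts), fromString.apply(split[1]));
        }
    }
    return builder.finish();
}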
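Since the FST builder expects keys to be added in sorted order (compared as unsigned bytes), it can be worth validating the text file before rebuilding, for example if it was edited or re-sorted by another tool in between. Here is a rough sketch in plain Java that parses the tab-separated lines and checks the ordering; `FstDumpParser` and its methods are hypothetical names I made up for illustration, not Lucene API.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: parses "key<TAB>output" lines and verifies the keys are in byte order. */
final class FstDumpParser {

    /** Compares two byte arrays as unsigned bytes; a strict prefix sorts before the longer array. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Returns the (key, output) pairs in file order, or throws if a line is malformed or out of order. */
    static List<Map.Entry<String, String>> parse(List<String> lines) {
        List<Map.Entry<String, String>> entries = new ArrayList<>();
        byte[] previous = null;
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // tolerate a blank line, e.g. at the end of the file
            }
            // Limit 2: only the first tab separates key from output.
            String[] split = line.split("\t", 2);
            if (split.length != 2) {
                throw new IllegalArgumentException("Missing tab separator: " + line);
            }
            byte[] key = split[0].getBytes(StandardCharsets.UTF_8);
            if (previous != null && compareUnsigned(previous, key) > 0) {
                throw new IllegalStateException("Key out of order: " + split[0]);
            }
            previous = key;
            entries.add(new SimpleImmutableEntry<>(split[0], split[1]));
        }
        return entries;
    }
}
```

You could feed it `Files.readAllLines(filePath)` and only rebuild the FST if it returns without throwing.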