Document and Field instance reuse in Lucene Indexing
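I am trying to reuse Document and Field instances to improve performance. (I tested with a file of 1 million lines; without reusing the instances, indexing took 20 seconds.)
But when I tried reusing them, it took far too long and just kept running.
Has anyone run into the same problem?
This is the existing code, from before I tried reusing instances. It creates a new Document and new Fields for every line in the file.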
FileInputStream fis;
try {
    fis = new FileInputStream(file);
    String filePath = file.getPath();
    BufferedReader br = new BufferedReader(
            new InputStreamReader(fis, StandardCharsets.UTF_8));
    String line = null;
    while ((line = br.readLine()) != null) {
        // Split on a literal '|' (the pipe must be escaped in the regex).
        String[] lineTokens = line.split("\\|");
        // Assumption: the first two tokens hold the two field values.
        String field1Value = lineTokens[0];
        String field2Value = lineTokens[1];
        // A new Document and new Fields are created for every line.
        Document doc = new Document();
        Field field1 = new TextField("field1", field1Value, Field.Store.YES);
        doc.add(field1);
        Field field2 = new StringField("field2", field2Value, Field.Store.YES);
        doc.add(field2);
        writer.addDocument(doc);
    }
    br.close();
} catch (FileNotFoundException fnfe) {
}
After the change:
FileInputStream fis;
try {
    fis = new FileInputStream(file);
    String filePath = file.getPath();
    BufferedReader br = new BufferedReader(
            new InputStreamReader(fis, StandardCharsets.UTF_8));
    String line = null;
    // One Document and one pair of Fields, created once before the loop.
    // The empty strings are placeholder initial values.
    Document doc = new Document();
    Field field1 = new TextField("field1", "", Field.Store.YES);
    Field field2 = new StringField("field2", "", Field.Store.YES);
    while ((line = br.readLine()) != null) {
        //String[] lineTokens = line.split("\\|");
        field1.setStringValue("field1Value");
        doc.add(field1); // called again on every iteration
        field2.setStringValue("field2Value");
        doc.add(field2); // called again on every iteration
        writer.addDocument(doc);
    }
    br.close();
} catch (FileNotFoundException fnfe) {
}
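You don't need to add the fields to the document on every iteration. Document.add appends a field rather than replacing an existing one, so in your changed code the document gains two more fields for every line and each addDocument call has more to index than the previous one. Once a field has been added, all you need to do is change its value and then write the changed document to the index, like this: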
Document doc = new Document();
// Create each field once with a placeholder value and add it to the document once.
Field field1 = new TextField("field1", "", Field.Store.YES);
doc.add(field1);
Field field2 = new StringField("field2", "", Field.Store.YES);
doc.add(field2);
while ((line = br.readLine()) != null) {
    String[] lineTokens = line.split("\\|");
    // Assumption: the first two tokens hold the two field values.
    field1.setStringValue(lineTokens[0]);
    field2.setStringValue(lineTokens[1]);
    writer.addDocument(doc);
}
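To get a feel for why the version that calls doc.add inside the loop never seems to finish, here is a rough sketch that only counts fields. It does not use Lucene at all: SimpleDocument and SimpleField are hypothetical stand-ins I made up for illustration, and the only behavior they copy is that add() appends to the document's field list. The "work" per line is assumed to be proportional to the number of fields in the document at the time it is written.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldReuseDemo {

    /** Hypothetical stand-in for a field: a name plus a mutable value. */
    static final class SimpleField {
        final String name;
        String value;

        SimpleField(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for a document: add() appends, it never replaces. */
    static final class SimpleDocument {
        final List<SimpleField> fields = new ArrayList<>();

        void add(SimpleField field) {
            fields.add(field);
        }
    }

    /** Re-adds both fields on every line, like the question's changed code. */
    public static long fieldsIndexedWhenReAdding(int lines) {
        SimpleDocument doc = new SimpleDocument();
        SimpleField field1 = new SimpleField("field1", "");
        SimpleField field2 = new SimpleField("field2", "");
        long total = 0;
        for (int i = 0; i < lines; i++) {
            field1.value = "a" + i;
            doc.add(field1);
            field2.value = "b" + i;
            doc.add(field2);
            // Fields that a single "write this document" call would have to process.
            total += doc.fields.size();
        }
        return total;
    }

    /** Adds both fields once, then only changes their values per line. */
    public static long fieldsIndexedWhenAddingOnce(int lines) {
        SimpleDocument doc = new SimpleDocument();
        SimpleField field1 = new SimpleField("field1", "");
        SimpleField field2 = new SimpleField("field2", "");
        doc.add(field1);
        doc.add(field2);
        long total = 0;
        for (int i = 0; i < lines; i++) {
            field1.value = "a" + i;
            field2.value = "b" + i;
            total += doc.fields.size(); // always 2
        }
        return total;
    }

    public static void main(String[] args) {
        int lines = 1000;
        System.out.println("re-adding each line: " + fieldsIndexedWhenReAdding(lines));
        System.out.println("adding once:         " + fieldsIndexedWhenAddingOnce(lines));
    }
}
```

With n lines the first count is n(n+1) and the second is 2n, so for 1000 lines that is 1,001,000 versus 2,000. At 1 million lines the first figure is on the order of 10^12 field instances, which would be consistent with the indexing run appearing to hang.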