Lucene LongPoint range search doesn't work
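I am using Lucene 8.2.0 on Java 11.
I am trying to index a Long value so that I can filter on it with a range query, for example: +my_range_field:[1 TO 200]. However, every variant of this, even my_range_field:[* TO *], returns 0 results in the minimal example below. As soon as I remove the + so that the clause becomes an OR, I get 2 results.
So I think I must be making a mistake in how I index the field, but I can't figure out what it is.
From the LongPoint JavaDoc: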
An indexed long field for fast range filters. If you also need to store the value, you should add a separate StoredField instance.
Finding all documents within an N-dimensional shape or range at search time is efficient. Multiple values for the same field in one document is allowed.
Here is my minimal example:
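import java.io.IOException;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;

public class LongPointRangeExample {
    public static void main(String[] args) {
        Directory index = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer();
        try {
            IndexWriter indexWriter = new IndexWriter(index, new IndexWriterConfig(analyzer));
            Document document1 = new Document();
            Document document2 = new Document();
            // LongPoint makes the value searchable, StoredField makes it retrievable
            document1.add(new LongPoint("my_range_field", 10));
            document1.add(new StoredField("my_range_field", 10));
            document2.add(new LongPoint("my_range_field", 100));
            document2.add(new StoredField("my_range_field", 100));
            document1.add(new TextField("my_text_field", "test content 1", Field.Store.YES));
            document2.add(new TextField("my_text_field", "test content 2", Field.Store.YES));
            indexWriter.deleteAll();
            indexWriter.commit();
            indexWriter.addDocument(document1);
            indexWriter.addDocument(document2);
            indexWriter.commit();
            indexWriter.close();
            QueryParser parser = new QueryParser("text", analyzer);
            IndexSearcher indexSearcher = new IndexSearcher(DirectoryReader.open(index));
            String luceneQuery = "+my_text_field:test* +my_range_field:[1 TO 200]";
            Query query = parser.parse(luceneQuery);
            // Prints 0, although both documents were expected to match
            System.out.println(indexSearcher.search(query, 10).totalHits.value);
        } catch (IOException | ParseException e) {
            // Report failures instead of silently swallowing them
            e.printStackTrace();
        }
    }
}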
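I found a solution to my problem.
I was under the impression that the query parser could parse any query string correctly. That does not seem to be the case.
Using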
Query rangeQuery = LongPoint.newRangeQuery("my_range_field", 1L, 11L);
Query searchQuery = new WildcardQuery(new Term("my_text_field", "test*"));
Query build = new BooleanQuery.Builder()
    .add(searchQuery, BooleanClause.Occur.MUST)
    .add(rangeQuery, BooleanClause.Occur.MUST)
    .build();
returned the correct results.
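As a self-contained illustration of the programmatic approach above, the following sketch indexes the two values from the question (10 and 100) and counts matches for a LongPoint range. It is only a sketch under some assumptions: the class name LongRangeDemo and the helper countInRange are made up for this example, ByteBuffersDirectory stands in for the question's RAMDirectory, and lucene-core 8.x is assumed to be on the classpath. The likely reason the parsed query found nothing is that the classic QueryParser builds a TermRangeQuery for [1 TO 200], which looks at indexed terms, while LongPoint values are indexed as points.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical demo class, not part of the original question or answers.
final class LongRangeDemo {

    /** Indexes the values 10 and 100, then counts documents whose value lies in [lower, upper]. */
    static int countInRange(long lower, long upper) {
        try (Directory dir = new ByteBuffersDirectory()) {
            // Closing the writer commits the two documents.
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
                for (long value : new long[] {10L, 100L}) {
                    Document doc = new Document();
                    doc.add(new LongPoint("my_range_field", value));   // searchable as a point
                    doc.add(new StoredField("my_range_field", value)); // retrievable from hits
                    writer.addDocument(doc);
                }
            }
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                // Both bounds of newRangeQuery are inclusive.
                Query query = LongPoint.newRangeQuery("my_range_field", lower, upper);
                return new IndexSearcher(reader).count(query);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countInRange(1L, 11L));   // only the value 10 is in range: prints 1
        System.out.println(countInRange(1L, 200L));  // both values are in range: prints 2
        // Exclusive upper bound of 100, expressed by shrinking the inclusive bound: prints 1
        System.out.println(countInRange(1L, Math.addExact(100L, -1)));
    }
}
```

Because the bounds are inclusive, an exclusive bound has to be expressed by adjusting the value by one, as the last call does with Math.addExact.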
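You need to use StandardQueryParser instead, and then give the parser a PointsConfig map, which essentially tells it which fields are to be treated as points. You will now get 2 results.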
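// Additional imports needed for this change
import java.text.DecimalFormat;
import java.util.HashMap;
import java.util.Map;
import org.apache.lucene.queryparser.flexible.core.QueryNodeException;
import org.apache.lucene.queryparser.flexible.standard.StandardQueryParser;
import org.apache.lucene.queryparser.flexible.standard.config.PointsConfig;

// Change the QueryParser line to the following
StandardQueryParser parser = new StandardQueryParser(analyzer);
IndexSearcher indexSearcher = new IndexSearcher(DirectoryReader.open(index));
/* Added code */
// Declare my_range_field as a long point so ranges on it become point range queries
PointsConfig longConfig = new PointsConfig(new DecimalFormat(), Long.class);
Map<String, PointsConfig> pointsConfigMap = new HashMap<>();
pointsConfigMap.put("my_range_field", longConfig);
parser.setPointsConfigMap(pointsConfigMap);
/* End of added code */
String luceneQuery = "+my_text_field:test* +my_range_field:[1 TO 200]";
// Change the query to the following; parse() takes the default field as its
// second argument and throws QueryNodeException rather than ParseException
Query query = parser.parse(luceneQuery, "text");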
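For anyone who wants to try this end to end, here is a minimal sketch that puts the indexing from the question and the PointsConfig setup together. Some assumptions: the class name PointsConfigDemo and the helper countHits are invented for this example, ByteBuffersDirectory replaces RAMDirectory, my_text_field is used as the default field, and lucene-core plus lucene-queryparser 8.x (with their transitive dependencies) are on the classpath.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.text.DecimalFormat;
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queryparser.flexible.core.QueryNodeException;
import org.apache.lucene.queryparser.flexible.standard.StandardQueryParser;
import org.apache.lucene.queryparser.flexible.standard.config.PointsConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

// Hypothetical demo class, not part of the original question or answers.
final class PointsConfigDemo {

    /** Indexes the question's two documents and counts the hits for a query string. */
    static int countHits(String queryString) {
        try (Directory dir = new ByteBuffersDirectory()) {
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
                long[] values = {10L, 100L};
                for (int i = 0; i < values.length; i++) {
                    Document doc = new Document();
                    doc.add(new LongPoint("my_range_field", values[i]));
                    doc.add(new TextField("my_text_field", "test content " + (i + 1), Field.Store.YES));
                    writer.addDocument(doc);
                }
            }

            // Register my_range_field as a long point so that range syntax on it
            // is turned into a point range query rather than a term range query.
            StandardQueryParser parser = new StandardQueryParser(new StandardAnalyzer());
            Map<String, PointsConfig> pointsConfigMap = new HashMap<>();
            pointsConfigMap.put("my_range_field", new PointsConfig(new DecimalFormat(), Long.class));
            parser.setPointsConfigMap(pointsConfigMap);

            Query query = parser.parse(queryString, "my_text_field");
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                return new IndexSearcher(reader).count(query);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (QueryNodeException e) {
            throw new IllegalArgumentException("Cannot parse query: " + queryString, e);
        }
    }

    public static void main(String[] args) {
        // The query from the question, which should now match both documents.
        System.out.println(countHits("+my_text_field:test* +my_range_field:[1 TO 200]"));
        // A narrower range that should match only the document with value 10.
        System.out.println(countHits("+my_text_field:test* +my_range_field:[1 TO 50]"));
    }
}
```

Fields that are not listed in the map keep being parsed as ordinary text fields, so each numeric field needs its own PointsConfig entry.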