Lucene query with range

I am indexing some items, which include title and cost as fields. Cost is a double value. I am preparing a query such as:

(title:item~0.8) AND (cost:[0.0 TO 200.0])

After parsing, query.toString() looks like this:

+title:item~0 +cost:[0.0 TO 200.0]

From the results returned, it is clear that cost is not taken into account. I am sure cost is indexed, because I can retrieve it.

Indexing code:

public void index(Set<Item> items) throws IOException {
    String path = "D:\\lucenedata\\myproj"; // backslashes must be escaped in a Java string literal
    Directory fsDir = FSDirectory.open(new File(path));
    StandardAnalyzer analyzer = new StandardAnalyzer();
    IndexWriterConfig iwConf = new IndexWriterConfig(Version.LUCENE_4_10_3, analyzer);
    iwConf.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
    IndexWriter indexWriter = new IndexWriter(fsDir, iwConf);
    for (Item item : items) {
        Document d = new Document();
        if (item.getCost() != null) {
            d.add(new DoubleField("cost", item.getCost().doubleValue(), Store.YES));
        }
        d.add(new TextField("title", item.getTitle(), Store.YES));
        indexWriter.addDocument(d);
    }
    indexWriter.commit();
    indexWriter.close();
    System.out.println("Indexed " + items.size() + " items");
}

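QueryParser does not generate numeric range queries. So you are searching for values where cost is between 0.0 and 200.0 lexicographically, rather than numerically. Furthermore, numeric fields are translated to a prefix-coded form in the index, so your results will be quite unpredictable.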
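To make the lexicographic-versus-numeric point concrete, here is a minimal sketch in plain Java, with no Lucene involved. The class and method names are made up for illustration, and it compares raw strings with String.compareTo, which is only an analogy: the terms Lucene actually stores for a DoubleField are prefix-coded, so real results are even less intuitive than this.

```java
public class LexicographicRangeDemo {

    // True when value lies in [lower, upper] under String.compareTo ordering.
    static boolean inLexicographicRange(String value, String lower, String upper) {
        return value.compareTo(lower) >= 0 && value.compareTo(upper) <= 0;
    }

    // True when value lies in [lower, upper] under ordinary numeric ordering.
    static boolean inNumericRange(double value, double lower, double upper) {
        return value >= lower && value <= upper;
    }

    public static void main(String[] args) {
        String[] costs = {"50.0", "150.0", "1000.0", "3.5"};
        for (String cost : costs) {
            boolean lexicographic = inLexicographicRange(cost, "0.0", "200.0");
            boolean numeric = inNumericRange(Double.parseDouble(cost), 0.0, 200.0);
            System.out.println(cost + " lexicographic=" + lexicographic + " numeric=" + numeric);
        }
    }
}
```

Compared as strings, "1000.0" falls inside the range and "50.0" falls outside it, which is the opposite of what a numeric comparison gives.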

It is best to generate numeric ranges through the query API, using NumericRangeQuery instead of the QueryParser, and then combine them with your parsed query using a BooleanQuery. Something like:

Query parsedQuery = parser.parse("title:item~0.8"); // the query text must be a quoted String
Query costQuery = NumericRangeQuery.newDoubleRange("cost", 0.00, 200.0, true, true);
BooleanQuery finalQuery = new BooleanQuery();
finalQuery.add(new BooleanClause(parsedQuery, BooleanClause.Occur.MUST));
finalQuery.add(new BooleanClause(costQuery, BooleanClause.Occur.MUST));

I ended up subclassing QueryParser and creating a NumericRangeQuery whenever the cost field is encountered. It works well.

public class WebSearchQueryParser extends QueryParser {

    public WebSearchQueryParser(String f, Analyzer a) {
        super(f, a);
    }

    protected Query getRangeQuery(final String field, final String min, final String max,
            final boolean startInclusive, final boolean endInclusive) throws ParseException {
        if ("cost".equals(field)) {
            return NumericRangeQuery.newDoubleRange(field, Double.parseDouble(min), Double.parseDouble(max),
                    startInclusive, endInclusive);
        }
        return super.getRangeQuery(field, min, max, startInclusive, endInclusive);
    }
}

Then initialize:

QueryParser queryParser = new WebSearchQueryParser("title", new StandardAnalyzer());

and parse my query (title:item~0.8) AND (cost:[0.0 TO 200.0]) as before.
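One possible refinement, offered as a sketch and not as something tested against Lucene: the getRangeQuery override calls Double.parseDouble directly, which throws a NullPointerException for a null endpoint and a NumberFormatException for "*". NumericRangeQuery.newDoubleRange takes boxed Double arguments and treats null as an open bound, so a small helper can map a missing endpoint to null. I have not checked how the parser hands over an open-ended bound such as cost:[* TO 200.0], so the helper below assumes it arrives as either null or a literal "*". The class name RangeBoundParser and the method parseBound are hypothetical.

```java
public class RangeBoundParser {

    // Hypothetical helper: converts a range endpoint to a boxed Double.
    // Returns null (meaning "unbounded") for a null, blank, or "*" endpoint.
    static Double parseBound(String text) {
        if (text == null) {
            return null;
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty() || "*".equals(trimmed)) {
            return null;
        }
        // Still throws NumberFormatException for genuinely non-numeric input.
        return Double.valueOf(trimmed);
    }

    public static void main(String[] args) {
        System.out.println(parseBound("0.0"));
        System.out.println(parseBound("*"));
        System.out.println(parseBound(null));
    }
}
```

Inside the override, Double.parseDouble(min) and Double.parseDouble(max) would then become parseBound(min) and parseBound(max).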