Hibernate Search 5.0 Numeric Lucene Query HSEARCH000233 issue
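Question: How can we give Hibernate Search a raw Lucene query string that contains both numeric and non-numeric fields?
Background: We recently upgraded to Hibernate Search 5.0, and because of a change in how Hibernate Search handles queries (pre-Lucene), many of our queries now fail with the following error: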
The specified query contains a string based sub query which targets the numeric encoded field(s)
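In most cases we pass queries to Hibernate Search using Lucene's text syntax and a MultiFieldQueryParser, because the queries we run are complex. Up to Hibernate Search 5.0 these worked fine. During the upgrade we ran into exceptions thrown by Hibernate Search that stop our application from running queries that used to work. We don't understand why the exception is thrown, or what the best way forward is.
While trying to track down the problem, I reduced what works and what doesn't to its most basic form. (This is built on Hibernate Search's QueryValidationTest.)
Example:
Given the following entity class: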
@Entity
@Indexed
public static class B {
    @Id
    @GeneratedValue
    private long id;

    @Field
    private long value;

    @Field
    private String text;
}
Test 1 (how we write queries for Hibernate Search: fails):
QueryParser parser = new MultiFieldQueryParser(new String[]{"id","value","num"},new StandardAnalyzer());
Query query = parser.parse("+(value:1 text:test)");
FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery( query, B.class );
fullTextQuery.list();
Result:
org.hibernate.search.exception.SearchException: HSEARCH000233: The specified query '+(value:1 text:test)' contains a string based sub query which targets the numeric encoded field(s) 'value'. Check your query or try limiting the targeted entities.
at org.hibernate.search.query.engine.impl.LazyQueryState.validateQuery(LazyQueryState.java:163)
at org.hibernate.search.query.engine.impl.LazyQueryState.search(LazyQueryState.java:102)
at org.hibernate.search.query.engine.impl.QueryHits.updateTopDocs(QueryHits.java:227)
at org.hibernate.search.query.engine.impl.QueryHits.<init>(QueryHits.java:122)
at org.hibernate.search.query.engine.impl.QueryHits.<init>(QueryHits.java:94)
at org.hibernate.search.query.engine.impl.HSQueryImpl.getQueryHits(HSQueryImpl.java:436)
at org.hibernate.search.query.engine.impl.HSQueryImpl.queryEntityInfos(HSQueryImpl.java:257)
at org.hibernate.search.query.hibernate.impl.FullTextQueryImpl.list(FullTextQueryImpl.java:200)
at org.hibernate.search.test.query.validation.QueryValidationTest.testRawLuceneWithNumericValue(QueryValidationTest.java:156)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.junit.runners.model.FrameworkMethod.runReflectiveCall(FrameworkMethod.java:47)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at org.hibernate.testing.junit4.ExtendedFrameworkMethod.invokeExplosively(ExtendedFrameworkMethod.java:62)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at org.hibernate.testing.junit4.FailureExpectedHandler.evaluate(FailureExpectedHandler.java:58)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner.run(ParentRunner.java:238)
at org.junit.runners.ParentRunner.schedule(ParentRunner.java:63)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
at org.junit.runners.ParentRunner.evaluate(ParentRunner.java:229)
at org.hibernate.testing.junit4.BeforeClassCallbackHandler.evaluate(BeforeClassCallbackHandler.java:43)
at org.hibernate.testing.junit4.AfterClassCallbackHandler.evaluate(AfterClassCallbackHandler.java:42)
at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Test 2 (a variation using a numeric range fails the same way: fails):
QueryParser parser = new MultiFieldQueryParser(new String[]{"id","value","text"},new StandardAnalyzer());
Query query = parser.parse("+(value:[1 TO 1] text:test)");
FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery( query, B.class );
fullTextQuery.list();
Test 3 (using Lucene term queries: succeeds):
TermQuery query = new TermQuery( new Term("text", "bar") );
TermQuery nq = new TermQuery( new Term("value", "1") );
BooleanQuery bq = new BooleanQuery();
bq.add(query, Occur.SHOULD);
bq.add(nq, Occur.SHOULD);
FullTextQuery fullTextQuery = fullTextSession.createFullTextQuery( bq, B.class );
Note: The full version of the test case illustrates what we are seeing: https://github.com/abrin/hibernate-search/blob/3fdcc8229f0bfa00329b9d977172fd218d82cac2/orm/src/test/java/org/hibernate/search/test/query/validation/QueryValidationTest.java
Thanks
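First of all, the cause of your problem is that as of Search 5, numeric types are indexed as Lucene numeric fields (as opposed to string-based fields). Besides the performance gain, this also allows sorting on numeric fields without padding. The Search 5 documentation says the following: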
Prior to Search 5, numeric field encoding was only chosen if
explicitly requested via @NumericField. As of Search 5 this encoding
is automatically chosen for numeric types. To avoid numeric encoding
you can explicitly specify a non numeric field bridge via
@Field.bridge or @FieldBridge. The package
org.hibernate.search.bridge.builtin contains a set of bridges which
encode numbers as strings, for example
org.hibernate.search.bridge.builtin.IntegerBridge.
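So if you want to keep the old behavior, you need to make sure your numeric values are still indexed as strings. In your example, value needs to be indexed using org.hibernate.search.bridge.builtin.LongBridge. You can do this with the @FieldBridge annotation (you can ignore the id case, since the document id is indexed as a string anyway):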
@Field
@FieldBridge(impl = LongBridge.class)
private long value;
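One caveat of going back to string encoding is the padding issue mentioned above: string terms are compared character by character, so ordering and range checks do not follow numeric order unless values are padded to a fixed width. The following is a minimal, plain-Java sketch of that effect. It does not use Lucene or Hibernate Search; the class and method names are made up for illustration, and it assumes the values end up indexed as plain decimal strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class: mimics how string-encoded numbers compare.
public class StringEncodedNumbers {

    /** Encodes each value as a plain decimal string and sorts lexicographically. */
    static List<String> sortAsStrings(long... values) {
        List<String> terms = new ArrayList<>();
        for (long v : values) {
            terms.add(Long.toString(v));
        }
        Collections.sort(terms);
        return terms;
    }

    /** Left-pads a non-negative value with zeros to a fixed width. */
    static String pad(long value, int width) {
        return String.format("%0" + width + "d", value);
    }

    /** Inclusive range check using string comparison, as a string range would. */
    static boolean inStringRange(String term, String lower, String upper) {
        return term.compareTo(lower) >= 0 && term.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        // Lexicographic order, not numeric order.
        System.out.println(sortAsStrings(9, 10, 100));      // [10, 100, 9]
        // "100" falls inside the string range ["1", "2"].
        System.out.println(inStringRange("100", "1", "2")); // true
        // With fixed-width padding the comparison matches numeric order again.
        System.out.println(inStringRange(pad(100, 5), pad(1, 5), pad(2, 5))); // false
    }
}
```

If you rely on range queries or sorting over string-encoded numbers, this is the behavior to watch for.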
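Some comments on your test scenarios:
- Test 1: The query parser only creates string-based queries. At this level Lucene has no knowledge of which fields are indexed numerically. Numeric fields can only be targeted/searched with an appropriate NumericRangeQuery. If you still want to use the query parser, you need to provide your own subclass and handle numeric fields yourself. See also: How do I make the QueryParser in Lucene handle numeric ranges?
- Test 2: Same problem. Even though you use the range syntax value:[1 TO 1], it only creates a text/string range query.
- Test 3: I don't think this actually works. It may not throw an exception, but I am pretty sure that if you check a few search results you will notice that the value term is ignored. A TermQuery is string-based and cannot find matches in a numerically encoded field. See also: Lucene 3.0.3 Numeric term query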
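To make the "provide your own subclass" suggestion more concrete, here is a rough sketch of what such a parser could look like. It assumes the Lucene 4.10 classic query parser that ships with Hibernate Search 5.0 and needs Lucene on the classpath. The class name NumericAwareQueryParser and the longFields parameter are hypothetical; you have to tell the parser yourself which fields are numeric, and this version only covers long fields.

```java
import java.util.Set;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.queryparser.classic.ParseException;
import org.apache.lucene.search.NumericRangeQuery;
import org.apache.lucene.search.Query;

// Hypothetical subclass: routes known long fields to NumericRangeQuery.
public class NumericAwareQueryParser extends MultiFieldQueryParser {

    // Names of the fields indexed as numeric longs (supplied by the caller).
    private final Set<String> longFields;

    public NumericAwareQueryParser(String[] fields, Analyzer analyzer, Set<String> longFields) {
        super(fields, analyzer);
        this.longFields = longFields;
    }

    // Handles explicitly fielded terms such as value:1.
    // Un-fielded terms expanded over the default fields are not covered here
    // and may still produce string queries.
    @Override
    protected Query getFieldQuery(String field, String queryText, boolean quoted)
            throws ParseException {
        if (field != null && longFields.contains(field)) {
            Long v = toLong(field, queryText);
            // An exact match is expressed as a range with identical bounds.
            return NumericRangeQuery.newLongRange(field, v, v, true, true);
        }
        return super.getFieldQuery(field, queryText, quoted);
    }

    // Handles range syntax such as value:[1 TO 10].
    @Override
    protected Query getRangeQuery(String field, String part1, String part2,
            boolean startInclusive, boolean endInclusive) throws ParseException {
        if (field != null && longFields.contains(field)) {
            return NumericRangeQuery.newLongRange(field,
                    toLong(field, part1), toLong(field, part2),
                    startInclusive, endInclusive);
        }
        return super.getRangeQuery(field, part1, part2, startInclusive, endInclusive);
    }

    // A null bound is passed through as an open-ended range.
    private static Long toLong(String field, String text) throws ParseException {
        if (text == null) {
            return null;
        }
        try {
            return Long.valueOf(text.trim());
        } catch (NumberFormatException e) {
            throw new ParseException("Not a valid long for field '" + field + "': " + text);
        }
    }
}
```

With something like this, the query from Test 1 could be parsed as new NumericAwareQueryParser(new String[]{"text"}, new StandardAnalyzer(), Collections.singleton("value")).parse("+(value:1 text:test)"), so that the value clause becomes a NumericRangeQuery instead of a string-based term query. Treat it as a starting point rather than a complete solution.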