Hibernate Search migration from 5.11.9 to 6.0.6 - analyzer not being applied
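I'm running into some problems while migrating to a newer version of Hibernate Search, mostly related to dynamic fields and their associated analyzers not being applied.
First, I build the analyzers: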
private void buildLocalizedAnalyzer(ElasticsearchAnalysisConfigurationContext builder) {
    SupportedLanguage.stream().forEach(
        supportedLanguage ->
        {
            builder.analyzer(supportedLanguage.getDescription() + "Analyzer").custom()
                .tokenizer(STANDARD)
                .tokenFilters(LOWERCASE, EDGE_NGRAM_3,
                    supportedLanguage.getDescription() + "Stemmer",
                    supportedLanguage.getDescription() + "Stop"
                );
            builder.tokenFilter(supportedLanguage.getDescription() + "Stemmer")
                .type("stemmer").param("language", supportedLanguage.getDescription());
            builder.tokenFilter(supportedLanguage.getDescription() + "Stop")
                .type("stop").param("stopwords", "_" + supportedLanguage.getDescription() + "_");
        }
    );
}
Then, using a property binder and bridge, I index the required fields:
@Override
public void bind(PropertyBindingContext context) {
    context.dependencies().useRootOnly();
    var rootField = context.indexSchemaElement()
        .objectField(context.bridgedElement().name());
    SupportedLanguage.stream().forEach(
        supportedLanguage -> context
            .indexSchemaElement()
            .fieldTemplate("localized_analyzer_" + supportedLanguage.name().toLowerCase(), f -> f
                .asString()
                .analyzer(supportedLanguage.getDescription() + "Analyzer"))
            .matchingPathGlob("properties.search_" + supportedLanguage.name().toLowerCase() + "_*"));
    context.bridge(List.class, new PropertyValueBinder.Bridge(rootField.toReference()));
}
In the property bridge:
private void indexEnumPropertiesForLocalizedSearch(DocumentElement target,
        PropertyValueEnum propertyValue,
        EnumValue enumValue) {
    var fieldName = PREFIX_SEARCH + DEFAULT + DELIMITER + propertyValue.getProperty().getCode();
    var indexedValue = ((EnumValueString) enumValue).getValue();
    target.addValue(fieldName, indexedValue);
    enumValue.getTranslations().forEach((language, translation) -> {
        var fieldNameTranslated = PREFIX_SEARCH + language.getCode() + DELIMITER + propertyValue.getProperty().getCode();
        var indexedValueTranslated = translation.getValue();
        target.addValue(fieldNameTranslated, indexedValueTranslated);
    });
}
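But when I retrieve the term vectors, no analyzer has been applied (the whole value is stored as a single term) and search does not work: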
_termvectors/8?fields=properties.search_en_category
{
  "_index": "product-000001",
  "_type": "_doc",
  "_id": "8",
  "_version": 1,
  "found": true,
  "took": 0,
  "term_vectors": {
    "properties.search_en_category": {
      "field_statistics": {
        "sum_doc_freq": 4,
        "doc_count": 4,
        "sum_ttf": 4
      },
      "terms": {
        "Category three": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 0,
              "start_offset": 0,
              "end_offset": 14
            }
          ]
        }
      }
    }
  }
}
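Am I missing something in the configuration/indexing process?
Thanks in advance.
Are you sure you dropped and re-created the index before testing? Field templates are part of the schema, so the schema needs to be created before they take effect. See https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#mapper-orm-schema-management
If that's not the problem, please provide the full code of your binder/bridge.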
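For the first point (re-creating the schema), one option described in the schema management section linked above is to let Hibernate Search drop and re-create the indexes on startup. This is a minimal sketch, assuming a development or test environment where losing the existing index content is acceptable. The property is set wherever the rest of the Hibernate ORM configuration lives, which is not shown in the question:

```properties
# Assumption: dev/test only. This deletes all existing index data on startup.
# Drops the existing indexes and re-creates them with the full schema,
# including the field templates declared in the binder.
hibernate.search.schema_management.strategy=drop-and-create
```

After the schema has been re-created the index is empty, so the entities need to be reindexed (for example with the mass indexer) before checking the term vectors again. It can also help to look at the mapping of the index in Elasticsearch: if the field templates were taken into account, they should appear there as dynamic templates.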