Elasticsearch Query String Query not working with synonym analyzer
I am trying to configure Elasticsearch with synonyms.
These are my settings:
"analysis": {
  "analyzer": {
    "category_synonym": {
      "tokenizer": "whitespace",
      "filter": [
        "synonym_filter"
      ]
    }
  },
  "filter": {
    "synonym_filter": {
      "type": "synonym",
      "synonyms_path": "synonyms.txt"
    }
  }
}
Mapping configuration:
"category": {
  "properties": {
    "name": {
      "type": "string",
      "search_analyzer": "category_synonym",
      "index_analyzer": "standard",
      "fields": {
        "raw": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}
And my synonym list:
film => video,
ooh => panels , poster,
commercial => advertisement,
print => magazine
I should mention that I am using the Elasticsearch Java API.
I am using QueryBuilders.queryStringQuery because it is the only way I have found to set the analyzer per request.
So, when I run:
QueryBuilders.queryStringQuery("name:film").analyzer(analyzer)
it returns:
[
  {
    "id": 71,
    "name": "Pitch video",
    "description": "... ",
    "parent": null
  },
  {
    "id": 25,
    "name": "Video",
    "description": "... ",
    "parent": null
  }
]
That is exactly what I want, but when I call it like this:
QueryBuilders.queryStringQuery("name:vid").analyzer(analyzer)
I expect it to return the same objects, but it returns nothing: []
So I added an asterisk to the queryStringQuery:
QueryBuilders.queryStringQuery("name:vid*").analyzer(analyzer)
That works well, but now
QueryBuilders.queryStringQuery("name:film*").analyzer(analyzer)
returns []
So, how can I configure Elasticsearch so that searching for video, vid, film and fil returns the same objects?
Thanks in advance!
Hmm, I don't think Elasticsearch will know how to "translate" fil to vid :-). So I think you need edgeNGrams for this, both at index time and at search time.
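As background for the behaviour described in the question, here is a rough Java sketch of why film and vid* matched while vid and film* did not. It is a toy model, not Elasticsearch code: the class name, the two methods, and the hard-coded term set are all made up for illustration. It assumes that a plain term in a query string is run through the search analyzer (so the synonym filter applies) and that a wildcard term is not analyzed by default (the query_string option analyze_wildcard is off unless you enable it), so it is matched as a raw prefix against the indexed terms.

```java
import java.util.Map;
import java.util.Set;

public class WildcardVsSynonym {
    // Hypothetical stand-in for the synonym filter: the explicit rule film => video.
    static final Map<String, String> SYNONYMS = Map.of("film", "video");

    // Assumed terms indexed for "Pitch video" and "Video" by the standard analyzer
    // in the original mapping (split into words, lowercased, no n-grams).
    static final Set<String> INDEXED_TERMS = Set.of("pitch", "video");

    // Plain term: analyzed first, so the synonym rule is applied, then an exact lookup.
    static boolean matchTerm(String queryTerm) {
        String analyzed = SYNONYMS.getOrDefault(queryTerm, queryTerm);
        return INDEXED_TERMS.contains(analyzed);
    }

    // Term with a trailing *: assumed to skip analysis, so it is a raw prefix lookup.
    static boolean matchPrefix(String prefix) {
        return INDEXED_TERMS.stream().anyMatch(term -> term.startsWith(prefix));
    }

    public static void main(String[] args) {
        System.out.println("film  -> " + matchTerm("film"));   // true: film becomes video
        System.out.println("vid   -> " + matchTerm("vid"));    // false: no indexed term "vid"
        System.out.println("vid*  -> " + matchPrefix("vid"));  // true: "video" starts with "vid"
        System.out.println("film* -> " + matchPrefix("film")); // false: synonym rule never runs
    }
}
```

In this model no single query form covers both cases, which is why the index itself has to contain the prefixes.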
PUT test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "category_synonym": {
          "tokenizer": "whitespace",
          "filter": [
            "synonym_filter",
            "my_edgeNGram_filter"
          ]
        },
        "standard_edgeNGram": {
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "synonym_filter",
            "my_edgeNGram_filter"
          ]
        }
      },
      "filter": {
        "synonym_filter": {
          "type": "synonym",
          "synonyms_path": "synonyms.txt"
        },
        "my_edgeNGram_filter": {
          "type": "edgeNGram",
          "min_gram": 2,
          "max_gram": 8
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "name": {
          "type": "string",
          "analyzer": "category_synonym",
          "index_analyzer": "standard_edgeNGram",
          "fields": {
            "raw": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}

POST test/test/1
{"name": "Pitch video"}

POST test/test/2
{"name": "Video"}

GET /test/test/_search
{
  "query": {
    "query_string": {
      "query": "name:fil"
    }
  }
}
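To make the effect of these analyzers easier to follow, here is a small Java sketch that imitates the chain (synonym rule, then edge n-grams with min_gram 2 and max_gram 8) on both the indexed text and the query. It is only a simplified model under my own assumptions, not what Lucene actually executes: the class and method names are invented, text is split on whitespace and lowercased on both sides (the real search-side analyzer above has no lowercase filter), only the film => video rule is modelled, and a document counts as a hit if it shares at least one gram with the query.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class EdgeNGramSynonymSketch {
    // Hypothetical stand-in for synonyms.txt: only the rule film => video.
    static final Map<String, String> SYNONYMS = Map.of("film", "video");
    static final int MIN_GRAM = 2;
    static final int MAX_GRAM = 8;

    // Prefixes of the token from MIN_GRAM up to MAX_GRAM characters long.
    static List<String> edgeNGrams(String token) {
        List<String> grams = new ArrayList<>();
        int longest = Math.min(MAX_GRAM, token.length());
        for (int len = MIN_GRAM; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Simplified analyzer: whitespace split, lowercase, synonym rule, edge n-grams.
    static Set<String> analyze(String text) {
        Set<String> grams = new LinkedHashSet<>();
        for (String raw : text.trim().toLowerCase().split("\\s+")) {
            String token = SYNONYMS.getOrDefault(raw, raw);
            grams.addAll(edgeNGrams(token));
        }
        return grams;
    }

    // A document matches if any gram of the query is among its indexed grams.
    static boolean matches(String query, String document) {
        Set<String> indexed = analyze(document);
        for (String gram : analyze(query)) {
            if (indexed.contains(gram)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(analyze("Video"));                   // [vi, vid, vide, video]
        System.out.println(matches("video", "Pitch video"));    // true
        System.out.println(matches("vid", "Pitch video"));      // true
        System.out.println(matches("film", "Pitch video"));     // true: film becomes video first
        System.out.println(matches("fil", "Pitch video"));      // false: grams are fi, fil
    }
}
```

In this model video, vid and film all reach both documents, while fil does not, because the synonym rule only fires on the whole token film. Note also that applying the n-gram filter to the query broadens matching: in the sketch, matches("vixen", "Video") is true because both sides share the gram vi.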