短语建议器 Elasticsearch 不纠正两个词
Phrase suggester Elasticsearch not correcting two words
我有以下映射到我的短语建议者:
{
"settings": {
"analysis": {
"analyzer": {
"suggests_analyzer": {
"tokenizer": "standard",
"filter": [
"lowercase",
"asciifolding",
"shingle_filter"
],
"type": "custom"
}
},
"filter": {
"shingle_filter": {
"min_shingle_size": 2,
"max_shingle_size": 6,
"type": "shingle"
}
}
}
},
"mappings": {
"sample_data": {
"properties": {
"name": {
"type": "string",
"analyzer": "suggests_analyzer"
}
}
}
}
}
我的索引中包含 "lung cancer"、"colorectal cancer"、"breast cancer"。但是,当我查询一个拼写错误的查询时,当我使用整理功能时,两个词都被拼错,如 "lhng cancar" returns 结果为零。我的示例查询如下。
{
"suggest": {
"text": "lhng cancar",
"simple_phrase": {
"phrase": {
"field": "name",
"size": 5,
"real_word_error_likelihood": 0.95,
"max_errors": 0.5,
"direct_generator": [
{
"field": "name",
"suggest_mode": "always",
"size": 5
}
],
"collate": {
"query": {
"inline": {
"match_phrase": {
"{{field_name}}": "{{suggestion}}"
}
}
},
"params": {
"field_name": "name"
},
"prune": false
}
}
}
},
"size": 0
}
上述查询的响应是:
{
"took": 17,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1868381,
"max_score": 0,
"hits": []
},
"suggest": {
"simple_phrase": [
{
"text": "lhng cancar",
"offset": 0,
"length": 11,
"options": []
}
]
}
}
需要在查询中进行哪些更改才能获得建议中 "lung cancer" 的预期结果?
您必须将 max_errors
提高到 0.8 或更多。
这里给出了相同的答案
提高 max_errors: 2
的参数解决了我的问题。
我有以下映射到我的短语建议者:
{
"settings": {
"analysis": {
"analyzer": {
"suggests_analyzer": {
"tokenizer": "standard",
"filter": [
"lowercase",
"asciifolding",
"shingle_filter"
],
"type": "custom"
}
},
"filter": {
"shingle_filter": {
"min_shingle_size": 2,
"max_shingle_size": 6,
"type": "shingle"
}
}
}
},
"mappings": {
"sample_data": {
"properties": {
"name": {
"type": "string",
"analyzer": "suggests_analyzer"
}
}
}
}
}
我的索引中包含 "lung cancer"、"colorectal cancer"、"breast cancer"。但是,当我查询一个拼写错误的查询时,当我使用整理功能时,两个词都被拼错,如 "lhng cancar" returns 结果为零。我的示例查询如下。
{
"suggest": {
"text": "lhng cancar",
"simple_phrase": {
"phrase": {
"field": "name",
"size": 5,
"real_word_error_likelihood": 0.95,
"max_errors": 0.5,
"direct_generator": [
{
"field": "name",
"suggest_mode": "always",
"size": 5
}
],
"collate": {
"query": {
"inline": {
"match_phrase": {
"{{field_name}}": "{{suggestion}}"
}
}
},
"params": {
"field_name": "name"
},
"prune": false
}
}
}
},
"size": 0
}
上述查询的响应是:
{
"took": 17,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1868381,
"max_score": 0,
"hits": []
},
"suggest": {
"simple_phrase": [
{
"text": "lhng cancar",
"offset": 0,
"length": 11,
"options": []
}
]
}
}
需要在查询中进行哪些更改才能获得建议中 "lung cancer" 的预期结果?
您必须将 max_errors
提高到 0.8 或更多。
这里给出了相同的答案
提高 max_errors: 2
的参数解决了我的问题。