索引上的 Elasticsearch 排序设置给出奇怪的结果
Elasticsearch sort settings on index giving strange results
我的索引设置如下:
PUT items
{
"settings": {
"index": {
"sort.field": ["popularity", "title_keyword"],
"sort.order": ["desc", "asc"]
},
"analysis": {
"analyzer": {
"autocomplete": {
"tokenizer": "autocomplete",
"filter": [
"lowercase"
]
},
"autocomplete_search": {
"tokenizer": "lowercase"
}
},
"tokenizer": {
"autocomplete": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 15,
"token_chars": [
"letter"
]
}
}
}
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "autocomplete_search"
},
"title_keyword": {
"type": "keyword"
},
"popularity": {
"type": "integer"
},
"visibility": {
"type": "keyword"
}
}
}
}
具有以下数据:
POST items/_doc/1
{
"title": "The Arbor",
"popularity": 5,
"title_keyword": "The Arbor",
"visibility": "public"
}
POST items/_doc/2
{
"title": "The Canon",
"popularity": 10,
"title_keyword": "The Canon",
"visibility": "public"
}
POST items/_doc/3
{
"title": "The Brew",
"popularity": 15,
"title_keyword": "The Brew",
"visibility": "public"
}
我运行这个查询数据:
GET items/_search
{
"size": 3,
"query": {
"bool": {
"must": [
{
"match": {
"title": {
"query": "the",
"operator": "and"
}
}
},
{
"match": {
"visibility": "public"
}
}
]
}
},
"highlight": {
"pre_tags": ["<mark>"],
"post_tags": ["</mark>"],
"fields": {
"title": {}
}
}
}
the
这个词的记录似乎是正确匹配的,但是排序好像不行。我希望它按定义的受欢迎程度排序,结果将按 The Arbor
、The Brew
、The Canon
的顺序排列,但我得到的结果如下:
{
"took" : 11,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.27381438,
"hits" : [
{
"_index" : "items",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.27381438,
"_source" : {
"title" : "The Brew",
"popularity" : 15,
"title_keyword" : "The Brew",
"visibility" : "public"
},
"highlight" : {
"title" : [
"<mark>The</mark> Brew"
]
}
},
{
"_index" : "items",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.26392496,
"_source" : {
"title" : "The Arbor",
"popularity" : 5,
"title_keyword" : "The Arbor",
"visibility" : "public"
},
"highlight" : {
"title" : [
"<mark>The</mark> Arbor"
]
}
},
{
"_index" : "items",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.26392496,
"_source" : {
"title" : "The Canon",
"popularity" : 10,
"title_keyword" : "The Canon",
"visibility" : "public"
},
"highlight" : {
"title" : [
"<mark>The</mark> Canon"
]
}
}
]
}
}
创建索引时定义排序字段和顺序,设置下是否自动排序结果?它似乎是按分数而不是受欢迎程度排序的。如果我在查询中包含排序选项,它会返回正确的结果:
GET items/_search
{
"size": 3,
"sort": [
{
"popularity": {
"order": "desc"
}
},
{
"title_keyword": {
"order": "asc"
}
}
],
"query": {
"bool": {
"must": [
{
"match": {
"title": {
"query": "the",
"operator": "and"
}
}
},
{
"match": {
"visibility": "public"
}
}
]
}
},
"highlight": {
"pre_tags": ["<mark>"],
"post_tags": ["</mark>"],
"fields": {
"title": {}
}
}
}
我了解到像这样在查询中包含排序是低效的,因此将其包含在设置中。我在创建索引时没有做某事以使其默认按受欢迎程度排序吗?在查询中包含排序选项会导致查询效率低下吗?或者我真的需要在每个查询中都包含它吗?
希望这是有道理的!谢谢
索引排序定义了段在分片中的排序方式,这与搜索结果的排序无关。您可以使用排序索引,如果您经常有使用相同条件排序的搜索,那么索引排序可以加快搜索速度。
如果您的搜索排序与索引不同或根本没有排序,则索引排序不相关。
请查看 documentation for index sorting and especially the part that explains how index sorting 已被使用。
我的索引设置如下:
PUT items
{
"settings": {
"index": {
"sort.field": ["popularity", "title_keyword"],
"sort.order": ["desc", "asc"]
},
"analysis": {
"analyzer": {
"autocomplete": {
"tokenizer": "autocomplete",
"filter": [
"lowercase"
]
},
"autocomplete_search": {
"tokenizer": "lowercase"
}
},
"tokenizer": {
"autocomplete": {
"type": "edge_ngram",
"min_gram": 2,
"max_gram": 15,
"token_chars": [
"letter"
]
}
}
}
},
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "autocomplete",
"search_analyzer": "autocomplete_search"
},
"title_keyword": {
"type": "keyword"
},
"popularity": {
"type": "integer"
},
"visibility": {
"type": "keyword"
}
}
}
}
具有以下数据:
POST items/_doc/1
{
"title": "The Arbor",
"popularity": 5,
"title_keyword": "The Arbor",
"visibility": "public"
}
POST items/_doc/2
{
"title": "The Canon",
"popularity": 10,
"title_keyword": "The Canon",
"visibility": "public"
}
POST items/_doc/3
{
"title": "The Brew",
"popularity": 15,
"title_keyword": "The Brew",
"visibility": "public"
}
我运行这个查询数据:
GET items/_search
{
"size": 3,
"query": {
"bool": {
"must": [
{
"match": {
"title": {
"query": "the",
"operator": "and"
}
}
},
{
"match": {
"visibility": "public"
}
}
]
}
},
"highlight": {
"pre_tags": ["<mark>"],
"post_tags": ["</mark>"],
"fields": {
"title": {}
}
}
}
the
这个词的记录似乎是正确匹配的,但是排序好像不行。我希望它按定义的受欢迎程度排序,结果将按 The Arbor
、The Brew
、The Canon
的顺序排列,但我得到的结果如下:
{
"took" : 11,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 0.27381438,
"hits" : [
{
"_index" : "items",
"_type" : "_doc",
"_id" : "3",
"_score" : 0.27381438,
"_source" : {
"title" : "The Brew",
"popularity" : 15,
"title_keyword" : "The Brew",
"visibility" : "public"
},
"highlight" : {
"title" : [
"<mark>The</mark> Brew"
]
}
},
{
"_index" : "items",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.26392496,
"_source" : {
"title" : "The Arbor",
"popularity" : 5,
"title_keyword" : "The Arbor",
"visibility" : "public"
},
"highlight" : {
"title" : [
"<mark>The</mark> Arbor"
]
}
},
{
"_index" : "items",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.26392496,
"_source" : {
"title" : "The Canon",
"popularity" : 10,
"title_keyword" : "The Canon",
"visibility" : "public"
},
"highlight" : {
"title" : [
"<mark>The</mark> Canon"
]
}
}
]
}
}
创建索引时定义排序字段和顺序,设置下是否自动排序结果?它似乎是按分数而不是受欢迎程度排序的。如果我在查询中包含排序选项,它会返回正确的结果:
GET items/_search
{
"size": 3,
"sort": [
{
"popularity": {
"order": "desc"
}
},
{
"title_keyword": {
"order": "asc"
}
}
],
"query": {
"bool": {
"must": [
{
"match": {
"title": {
"query": "the",
"operator": "and"
}
}
},
{
"match": {
"visibility": "public"
}
}
]
}
},
"highlight": {
"pre_tags": ["<mark>"],
"post_tags": ["</mark>"],
"fields": {
"title": {}
}
}
}
我了解到像这样在查询中包含排序是低效的,因此将其包含在设置中。我在创建索引时没有做某事以使其默认按受欢迎程度排序吗?在查询中包含排序选项会导致查询效率低下吗?或者我真的需要在每个查询中都包含它吗?
希望这是有道理的!谢谢
索引排序定义了段在分片中的排序方式,这与搜索结果的排序无关。您可以使用排序索引,如果您经常有使用相同条件排序的搜索,那么索引排序可以加快搜索速度。
如果您的搜索排序与索引不同或根本没有排序,则索引排序不相关。
请查看 documentation for index sorting and especially the part that explains how index sorting 已被使用。