Sort elasticsearch aggregation buckets by text field

I am trying to sort the buckets produced by an elasticsearch aggregation. I have a large set of documents:

"mappings": {
    "properties": {
        "price": {
            "type": "double"
        },
        "product_name": {
            "type": "text"
        },
        "product_id": {
            "type": "keyword"
        },
        "timestamp": {
            "type": "date"
        }
    }
}

What I'm currently doing is getting the latest sale for each product_id using the composite and top_hits aggregations:

{
    "query": {
        "bool": {
            "filter": [
                {
                    "range": {
                        "timestamp": {
                            "gte": "2019-10-25T00:00:00Z",
                            "lte": "2019-10-26T00:00:00Z"
                        }
                    }
                }
            ]
        }
    },
    "aggs": {
        "distinct_products": {
            "composite": {
                "sources": [
                    {
                        "distinct_ids": {
                            "terms": {
                                "field": "product_id"
                            }
                        }
                    }
                ],
                "size": 10000
            },
            "aggs": {
                "last_timestamp": {
                    "top_hits": {
                        "sort": {
                            "timestamp": {
                                "order": "desc"
                            }
                        },
                        "size": 1
                    }
                }
            }
        }
    }
}

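Now I want to sort the resulting buckets by an arbitrary field. If I want to sort by price, I can use the solution from another question: add a max aggregation that extracts the product_price field from each bucket, then add a bucket_sort aggregation at the end that sorts on the result of the max:
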
{
    "query": {
        "bool": {
            "filter": [
                {
                    "range": {
                        "timestamp": {
                            "gte": "2019-10-25T00:00:00Z",
                            "lte": "2019-10-26T00:00:00Z"
                        }
                    }
                }
            ]
        }
    },
    "aggs": {
        "distinct_products": {
            "composite": {
                "sources": [
                    {
                        "distinct_ids": {
                            "terms": {
                                "field": "product_id"
                            }
                        }
                    }
                ],
                "size": 10000
            },
            "aggs": {
                "last_timestamp": {
                    "top_hits": {
                        "sort": {
                            "timestamp": {
                                "order": "desc"
                            }
                        },
                        "size": 1,
                        "_source": {
                            "excludes": []
                        }
                    }
                },
                "latest_sell": {
                    "max": {
                        "field": "product_price"
                    }
                },
                "latest_sell_secondary": {
                    "max": {
                        "field": "timestamp"
                    }
                },
                "sort_sells": {
                    "bucket_sort": {
                        "sort": {
                            "latest_sell": {
                                "order": "desc"
                            },
                            "latest_sell_secondary": {
                                "order": "desc"
                            }
                        },
                        "from": 0,
                        "size": 10000
                    }
                }
            }
        }
    }
}

If I want to sort alphabetically by product_name instead of by product_price, I can't use the max aggregation, because it only works on numeric fields.

How can I sort the last_timestamp buckets (each of which contains only one document) by a text field?

The elasticsearch version I'm using is 7.2.0.

From the documentation:

Each bucket may be sorted based on its _key, _count or its sub-aggregations

You can aggregate on product_name.keyword instead of the product ID, and sort the buckets by key:

"order": { "_key" : "asc" }
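
Put together, the request could look roughly like the sketch below. It assumes product_name has a keyword sub-field called product_name.keyword, which the mapping in the question does not define, so the mapping would need a multi-field and the data would need reindexing first. The aggregation name products_by_name is made up for illustration. Note also that bucketing by name instead of product_id merges products that share the same name into one bucket, so this only fits if names are unique per product.

```json
{
    "size": 0,
    "query": {
        "bool": {
            "filter": [
                {
                    "range": {
                        "timestamp": {
                            "gte": "2019-10-25T00:00:00Z",
                            "lte": "2019-10-26T00:00:00Z"
                        }
                    }
                }
            ]
        }
    },
    "aggs": {
        "products_by_name": {
            "terms": {
                "field": "product_name.keyword",
                "order": { "_key": "asc" },
                "size": 10000
            },
            "aggs": {
                "last_timestamp": {
                    "top_hits": {
                        "sort": {
                            "timestamp": {
                                "order": "desc"
                            }
                        },
                        "size": 1
                    }
                }
            }
        }
    }
}
```

Here the terms aggregation replaces the composite one, so the buckets come back ordered alphabetically by their key and each bucket still carries its most recent document through top_hits. Using "desc" in place of "asc" should reverse the order.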