Range ElasticSearch Aggregation
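I need to compute a pipeline aggregation in ElasticSearch, but I can't figure out how to express it.
Each document has an email address and an amount. I need range buckets over the summed amount per unique email, where each bucket gives the number of emails that fall into it.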
{ "0 - 99": 300, "100 - 400": 100 ...}
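That is essentially the expected output (my application code will transform the keys). It indicates that 300 unique emails have each received a cumulative amount between 0 and 99 across all documents.
Intuitively, I would expect the query to look like the one below. However, range does not appear to be a pipeline aggregation (it does not accept buckets_path).
What is the correct approach here?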
{
  aggs: {
    users: {
      terms: {
        field: "email"
      },
      aggs: {
        amount_received: {
          sum: {
            field: "amount"
          }
        }
      }
    },
    amount_ranges: {
      range: {
        buckets_path: "users>amount_received",
        ranges: [
          { to: 99.0 },
          { from: 100.0, to: 299.0 },
          { from: 300.0, to: 599.0 },
          { from: 600.0 }
        ]
      }
    }
  }
}
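There is no pipeline aggregation that does this directly. However, I think I have come up with a solution that should fit your needs. The idea is to repeat the same terms/sum aggregation once per range you are interested in, and to filter each copy with a bucket_selector pipeline aggregation.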
POST index/_search
{
  "size": 0,
  "aggs": {
    "users_99": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "-99": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived < 100"
          }
        }
      }
    },
    "users_100_299": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "100-299": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 100 && params.amountReceived < 300"
          }
        }
      }
    },
    "users_300_599": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "300-599": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 300 && params.amountReceived < 600"
          }
        }
      }
    },
    "users_600": {
      "terms": {
        "field": "email",
        "size": 1000
      },
      "aggs": {
        "amount_received": {
          "sum": {
            "field": "amount"
          }
        },
        "600": {
          "bucket_selector": {
            "buckets_path": {
              "amountReceived": "amount_received"
            },
            "script": "params.amountReceived >= 600"
          }
        }
      }
    }
  }
}
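Since the four aggregations differ only in their name and script, the request body could also be generated from a list of ranges rather than written out by hand. The following is a minimal Python sketch of that idea, not part of the original query: the RANGES list, the helper names range_script and build_request_body, and the selector name "in_range" are all assumptions made for illustration.

```python
import json

# Hypothetical range list: (aggregation name, inclusive lower bound, exclusive upper bound).
# None means the range is open on that side.
RANGES = [
    ("users_99", None, 100),
    ("users_100_299", 100, 300),
    ("users_300_599", 300, 600),
    ("users_600", 600, None),
]


def range_script(lower, upper):
    """Return a bucket_selector script that keeps sums with lower <= sum < upper."""
    conditions = []
    if lower is not None:
        conditions.append(f"params.amountReceived >= {lower}")
    if upper is not None:
        conditions.append(f"params.amountReceived < {upper}")
    return " && ".join(conditions)


def build_request_body(ranges, terms_size=1000):
    """Build one terms/sum aggregation per range, each filtered by a bucket_selector."""
    aggs = {}
    for name, lower, upper in ranges:
        aggs[name] = {
            "terms": {"field": "email", "size": terms_size},
            "aggs": {
                "amount_received": {"sum": {"field": "amount"}},
                # "in_range" is an arbitrary name chosen for this sketch.
                "in_range": {
                    "bucket_selector": {
                        "buckets_path": {"amountReceived": "amount_received"},
                        "script": range_script(lower, upper),
                    }
                },
            },
        }
    return {"size": 0, "aggs": aggs}


body = build_request_body(RANGES)
print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the request above and can be sent as the body of POST index/_search with whichever HTTP or Elasticsearch client you already use.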
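In the results, the number of buckets in users_99 is the number of unique emails whose total amount is below 100. Likewise, users_100_299 contains one bucket for each unique email whose total amount is at least 100 and below 300, and so on.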
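To get from that response to the kind of output the question asks for, the application only has to count the buckets left in each aggregation. Here is a minimal Python sketch of that step; the function name count_emails_per_range and the sample response are made up for illustration, and the response is heavily trimmed compared with what Elasticsearch actually returns.

```python
def count_emails_per_range(response):
    """Map each aggregation name to the number of buckets its bucket_selector kept."""
    return {
        name: len(agg["buckets"])
        for name, agg in response["aggregations"].items()
    }


# Hypothetical, trimmed search response: one bucket per email that passed the selector.
sample_response = {
    "aggregations": {
        "users_99": {
            "buckets": [
                {"key": "a@example.com", "doc_count": 2, "amount_received": {"value": 40.0}},
                {"key": "b@example.com", "doc_count": 1, "amount_received": {"value": 99.5}},
            ]
        },
        "users_100_299": {
            "buckets": [
                {"key": "c@example.com", "doc_count": 3, "amount_received": {"value": 150.0}},
            ]
        },
        "users_300_599": {"buckets": []},
        "users_600": {"buckets": []},
    }
}

counts = count_emails_per_range(sample_response)
print(counts)
# {'users_99': 2, 'users_100_299': 1, 'users_300_599': 0, 'users_600': 0}
```

Keep in mind that bucket_selector only filters the buckets that the terms aggregation returned, so no count can exceed the terms size (1000 in the query above); with more unique emails than that, the size would need to be raised.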