How to combine multiple aggs in Elasticsearch?
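I want to count the number of visits per IP, per product, per day.
The index (nginx-access-log) has 3 fields:
- timestamp
- clientip
- product_id
I know about date_histogram; see https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html.
But I don't know how to combine the aggs to build the query.
Update:
I am using the query below to search: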
GET log-nginx_access*/_search
{
  "aggs": {
    "by_day": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "1d",
        "time_zone": "Asia/Shanghai",
        "min_doc_count": 1
      },
      "aggs": {
        "by_product": {
          "terms": {
            "field": "uri_args.product_id",
            "size": 100
          }
        },
        "aggs": {
          "by_ip": {
            "terms": {
              "field": "clientip"
            }
          }
        }
      }
    }
  }
}
It fails with this error:
{
  "error": {
    "root_cause": [
      {
        "type": "unknown_named_object_exception",
        "reason": "Unknown BaseAggregationBuilder [by_ip]",
        "line": 18,
        "col": 20
      }
    ],
    "type": "unknown_named_object_exception",
    "reason": "Unknown BaseAggregationBuilder [by_ip]",
    "line": 18,
    "col": 20
  },
  "status": 400
}
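Maybe we can use the terms and date_histogram aggregations together. In your query the second aggs is a sibling of by_product rather than a child of it, so Elasticsearch reads aggs as an aggregation name and by_ip as its type, which produces the Unknown BaseAggregationBuilder [by_ip] error. Nest it inside by_product instead: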
GET /{index_name}/_search
{
  "aggs": {
    "by_day": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "day"
      },
      "aggs": {
        "by_product": {
          "terms": {
            "field": "product",
            "size": 100 // up to 100 unique products will be aggregated
          },
          "aggs": {
            "by_ip": {
              "terms": {
                "field": "ip"
              }
            }
          }
        }
      }
    }
  }
}
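To make the nesting easier to see, here is a minimal Python sketch that builds the same request body from the inside out, using the field names from the question (timestamp, uri_args.product_id, clientip). The function name and its parameters are hypothetical, and the top-level "size": 0 is my addition to suppress the hit list; it is not in the original query. The "interval" key is kept as in the question; on Elasticsearch 7.2 and later, "calendar_interval" is the replacement.

```python
import json


def build_daily_product_ip_query(day_field="timestamp",
                                 product_field="uri_args.product_id",
                                 ip_field="clientip",
                                 product_size=100,
                                 ip_size=10):
    """Return a search body with by_ip nested in by_product nested in by_day."""
    # Innermost level: one bucket per client IP.
    by_ip = {"terms": {"field": ip_field, "size": ip_size}}

    # Middle level: one bucket per product. Note that "aggs" sits next to
    # "terms" INSIDE by_product, not next to by_product itself.
    by_product = {
        "terms": {"field": product_field, "size": product_size},
        "aggs": {"by_ip": by_ip},
    }

    # Outermost level: one bucket per day.
    by_day = {
        "date_histogram": {
            "field": day_field,
            "interval": "1d",
            "time_zone": "Asia/Shanghai",
            "min_doc_count": 1,
        },
        "aggs": {"by_product": by_product},
    }

    # "size": 0 (an assumption, not in the question) skips returning raw hits.
    return {"size": 0, "aggs": {"by_day": by_day}}


body = build_daily_product_ip_query()
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a _search request. Each level follows the same pattern: the aggregation type and its child "aggs" are siblings under the aggregation's name.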
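Each bucket in the terms aggregation response has a doc_count field, which should give you the count you need. One thing to keep in mind is the size parameter, which caps how many distinct terms are returned at each level (the default is 10, so by_ip above returns at most the top 10 IPs per product).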
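As a rough illustration of reading those doc_count values, the sketch below walks a nested response and flattens it into (day, product, ip, count) rows. The function name is hypothetical and the sample response is made-up data shaped like what a by_day / by_product / by_ip query returns; it is not real output from the asker's index.

```python
def flatten_counts(response):
    """Flatten nested by_day/by_product/by_ip buckets into tuples."""
    rows = []
    for day in response["aggregations"]["by_day"]["buckets"]:
        for product in day["by_product"]["buckets"]:
            for ip in product["by_ip"]["buckets"]:
                # doc_count on the innermost bucket is the number of
                # documents for this day + product + IP combination.
                rows.append((day["key_as_string"], product["key"],
                             ip["key"], ip["doc_count"]))
    return rows


# Hypothetical response fragment, for illustration only.
sample_response = {
    "aggregations": {
        "by_day": {
            "buckets": [
                {
                    "key_as_string": "2019-01-01T00:00:00.000+08:00",
                    "doc_count": 6,
                    "by_product": {
                        "buckets": [
                            {
                                "key": "p1",
                                "doc_count": 4,
                                "by_ip": {
                                    "buckets": [
                                        {"key": "10.0.0.1", "doc_count": 3},
                                        {"key": "10.0.0.2", "doc_count": 1},
                                    ]
                                },
                            },
                            {
                                "key": "p2",
                                "doc_count": 2,
                                "by_ip": {
                                    "buckets": [
                                        {"key": "10.0.0.1", "doc_count": 2},
                                    ]
                                },
                            },
                        ]
                    },
                }
            ]
        }
    }
}

rows = flatten_counts(sample_response)
for row in rows:
    print(row)
```

Any product or IP beyond the configured size at its level will not appear in the buckets at all, so the rows only cover the top terms.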