How to get document values instead of document counts in an Elasticsearch bucket aggregation query
I have four documents in my ES index:
{
"_index": "my-index",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"@timestamp": "2099-11-15T13:12:00",
"message": "INFO GET /search HTTP/1.1 200 1070000",
"user": {
"id": "test@gmail.com"
}
}
},
{
"_index": "my-index",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"@timestamp": "2099-11-15T13:15:00",
"message": "Error GET /search HTTP/1.1 200 1070000",
"user": {
"id": "test@gmail.com"
}
}
},
{
"_index": "my-index",
"_type": "_doc",
"_id": "3",
"_score": 1.0,
"_source": {
"@timestamp": "2099-11-15T13:20:00",
"message": "INFO GET /parse HTTP/1.1 200 1070000",
"user": {
"id": "test@gmail.com"
}
}
},
{
"_index": "my-index",
"_type": "_doc",
"_id": "4",
"_score": 1.0,
"_source": {
"@timestamp": "2099-11-15T13:26:00",
"message": "Error GET /parse HTTP/1.1 200 1070000",
"user": {
"id": "test@gmail.com"
}
}
}
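I am writing a bucket aggregation query with filters to group all documents in the index by message type (Info or Error). In the example above, the index holds 4 documents: two with messages of type "INFO" and two with messages of type "Error".
I want the query to return the results grouped by message type. The expected result is two buckets with two documents each, but my query returns only the document count for each bucket, not the actual documents.
The query I am using is: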
{
"size":0,
"aggs" : {
"messages" : {
"filters" : {
"filters" : {
"info" : { "match" : { "message" : "Info" }},
"error" : { "match" : { "message" : "Error" }}
}
}
}
}
}
The output of the above query is:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"messages": {
"buckets": {
"error": {
"doc_count": 2
},
"info": {
"doc_count": 2
}
}
}
}
}
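But my requirement is to get the actual documents, with their field values, inside each bucket. Is there any way to change the filters bucket aggregation query so that each bucket contains the documents themselves?
You can use the top_hits aggregation as a sub-aggregation to get the matching documents inside each bucket: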
{
"size": 0,
"aggs": {
"messages": {
"filters": {
"filters": {
"info": {
"match": {
"message": "Info"
}
},
"error": {
"match": {
"message": "Error"
}
}
}
},
"aggs": {
"top_filters_hits": {
"top_hits": {
"_source": {
"includes": [
"message",
"user.id"
]
}
}
}
}
}
}
}
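One thing to watch for: top_hits returns only the top 3 hits per bucket unless its size is set, so buckets with more than three documents would be truncated. The following is a minimal Python sketch that builds the same request body with an explicit size. The function name build_grouped_hits_query and its parameters are hypothetical, chosen for illustration; it only constructs the JSON body and does not send it to a cluster.

```python
import json


def build_grouped_hits_query(terms_by_bucket, source_fields, hits_per_bucket=10):
    """Build a filters aggregation with a top_hits sub-aggregation.

    terms_by_bucket maps a bucket name to the text matched against the
    "message" field (hypothetical helper, not part of any ES client).
    """
    named_filters = {
        name: {"match": {"message": term}}
        for name, term in terms_by_bucket.items()
    }
    return {
        "size": 0,  # suppress the top-level hits; only aggregations are needed
        "aggs": {
            "messages": {
                "filters": {"filters": named_filters},
                "aggs": {
                    "top_filters_hits": {
                        "top_hits": {
                            # top_hits defaults to 3 hits per bucket
                            "size": hits_per_bucket,
                            "_source": {"includes": list(source_fields)},
                        }
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    body = build_grouped_hits_query(
        {"info": "Info", "error": "Error"}, ["message", "user.id"]
    )
    print(json.dumps(body, indent=2))
```

The printed body can be pasted into a _search request against the index as-is.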
The search result will be:
"aggregations": {
"messages": {
"buckets": {
"error": {
"doc_count": 2,
"top_filters_hits": {
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "67033379",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"message": "Error GET /search HTTP/1.1 200 1070000",
"user": {
"id": "test@gmail.com"
}
}
},
{
"_index": "67033379",
"_type": "_doc",
"_id": "4",
"_score": 1.0,
"_source": {
"message": "Error GET /parse HTTP/1.1 200 1070000",
"user": {
"id": "test@gmail.com"
}
}
}
]
}
}
},
"info": {
"doc_count": 2,
"top_filters_hits": {
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "67033379",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"_source": {
"message": "INFO GET /search HTTP/1.1 200 1070000",
"user": {
"id": "test@gmail.com"
}
}
},
{
"_index": "67033379",
"_type": "_doc",
"_id": "3",
"_score": 1.0,
"_source": {
"message": "INFO GET /parse HTTP/1.1 200 1070000",
"user": {
"id": "test@gmail.com"
}
}
}
]
}
}
}
}
}
}
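On the client side, the documents sit a few levels deep under each bucket. As a rough sketch, assuming the response has already been parsed into a Python dict, the grouped documents can be pulled out like this. The function name documents_by_bucket is hypothetical, and sample_response is a trimmed copy of the result above.

```python
def documents_by_bucket(response, agg_name="messages", hits_name="top_filters_hits"):
    """Map each bucket name to the list of _source documents it contains."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {
        name: [hit["_source"] for hit in bucket[hits_name]["hits"]["hits"]]
        for name, bucket in buckets.items()
    }


def _hit(doc_id, message):
    # Helper that builds one hit in the shape shown in the result above.
    return {
        "_id": doc_id,
        "_score": 1.0,
        "_source": {"message": message, "user": {"id": "test@gmail.com"}},
    }


# Trimmed copy of the response shown above (illustrative only).
sample_response = {
    "aggregations": {
        "messages": {
            "buckets": {
                "error": {
                    "doc_count": 2,
                    "top_filters_hits": {"hits": {"hits": [
                        _hit("2", "Error GET /search HTTP/1.1 200 1070000"),
                        _hit("4", "Error GET /parse HTTP/1.1 200 1070000"),
                    ]}},
                },
                "info": {
                    "doc_count": 2,
                    "top_filters_hits": {"hits": {"hits": [
                        _hit("1", "INFO GET /search HTTP/1.1 200 1070000"),
                        _hit("3", "INFO GET /parse HTTP/1.1 200 1070000"),
                    ]}},
                },
            }
        }
    }
}

grouped = documents_by_bucket(sample_response)
for bucket_name, docs in grouped.items():
    print(bucket_name, [doc["message"] for doc in docs])
```

Each bucket then holds plain documents with only the fields requested through _source.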