How to get document values instead of document counts in an Elasticsearch bucket aggregation query

Question

I have four documents in my ES index.

        {
            "_index": "my-index",
            "_type": "_doc",
            "_id": "1",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2099-11-15T13:12:00",
                "message": "INFO GET /search HTTP/1.1 200 1070000",
                "user": {
                    "id": "test@gmail.com"
                }
            }
        },
        {
            "_index": "my-index",
            "_type": "_doc",
            "_id": "2",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2099-11-15T13:15:00",
                "message": "Error GET /search HTTP/1.1 200 1070000",
                "user": {
                    "id": "test@gmail.com"
                }
            }
        },
        {
            "_index": "my-index",
            "_type": "_doc",
            "_id": "3",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2099-11-15T13:20:00",
                "message": "INFO GET /parse HTTP/1.1 200 1070000",
                "user": {
                    "id": "test@gmail.com"
                }
            }
        },
        {
            "_index": "my-index",
            "_type": "_doc",
            "_id": "4",
            "_score": 1.0,
            "_source": {
                "@timestamp": "2099-11-15T13:26:00",
                "message": "Error GET /parse HTTP/1.1 200 1070000",
                "user": {
                    "id": "test@gmail.com"
                }
            }
        }

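I am writing a bucket aggregation query with filters to group all documents in the index by message type (Info or Error). In the example above, the index holds 4 documents: two with messages of type "Info" and two with messages of type "Error".

I want the bucket aggregation query to return results grouped by message type. The expected result is two buckets with two documents each. However, my query returns only the document count for each bucket, not the actual documents.

The query I am using is: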

    {
      "size": 0,
      "aggs": {
        "messages": {
          "filters": {
            "filters": {
              "info":  { "match": { "message": "Info" } },
              "error": { "match": { "message": "Error" } }
            }
          }
        }
      }
    }

The output of the above query is:

    {
      "took": 3,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 2,
          "relation": "eq"
        },
        "max_score": null,
        "hits": []
      },
      "aggregations": {
        "messages": {
          "buckets": {
            "error": {
              "doc_count": 2
            },
            "info": {
              "doc_count": 2
            }
          }
        }
      }
    }

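My requirement, however, is to get the actual documents, with their field values, inside each bucket. Is there a way to change the filters bucket aggregation query so that each bucket contains its documents and their values?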

Answer 1

You can use the top_hits aggregation as a sub-aggregation to get the documents that fall into each bucket:

{
  "size": 0,
  "aggs": {
    "messages": {
      "filters": {
        "filters": {
          "info": {
            "match": {
              "message": "Info"
            }
          },
          "error": {
            "match": {
              "message": "Error"
            }
          }
        }
      },
      "aggs": {
        "top_filters_hits": {
          "top_hits": {
            "_source": {
              "includes": [
                "message",
                "user.id"
              ]
            }
          }
        }
      }
    }
  }
}
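
One detail worth keeping in mind: top_hits returns at most 3 hits per bucket unless "size" is set, so the query above returns every document here only because each bucket holds two. As far as I know, the upper limit is governed by the index.max_inner_result_window setting (100 by default). Below is a minimal Python sketch that builds the same request body with an explicit per-bucket size. The function name build_grouped_hits_query and its parameters are made up for illustration; sending the body to the cluster is left to whatever client you use.

```python
import json


def build_grouped_hits_query(filters, source_fields, hits_per_bucket=3):
    """Build a filters aggregation with a top_hits sub-aggregation.

    filters: mapping of bucket name -> text to match in the `message` field.
    source_fields: fields to keep in each returned document's _source.
    hits_per_bucket: hypothetical parameter mapped to top_hits "size".
    """
    return {
        "size": 0,  # no top-level hits; documents come back inside the buckets
        "aggs": {
            "messages": {
                "filters": {
                    "filters": {
                        name: {"match": {"message": text}}
                        for name, text in filters.items()
                    }
                },
                "aggs": {
                    "top_filters_hits": {
                        "top_hits": {
                            "size": hits_per_bucket,
                            "_source": {"includes": list(source_fields)},
                        }
                    }
                },
            }
        },
    }


query = build_grouped_hits_query(
    {"info": "Info", "error": "Error"},
    ["message", "user.id"],
    hits_per_bucket=10,
)
print(json.dumps(query, indent=2))
```

If the buckets can grow large, top_hits is probably not the right tool for returning all of them; running one filtered search per message type may be a better fit.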

The search result will be:

"aggregations": {
    "messages": {
      "buckets": {
        "error": {
          "doc_count": 2,
          "top_filters_hits": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "67033379",
                  "_type": "_doc",
                  "_id": "2",
                  "_score": 1.0,
                  "_source": {
                    "message": "Error GET /search HTTP/1.1 200 1070000",
                    "user": {
                      "id": "test@gmail.com"
                    }
                  }
                },
                {
                  "_index": "67033379",
                  "_type": "_doc",
                  "_id": "4",
                  "_score": 1.0,
                  "_source": {
                    "message": "Error GET /parse HTTP/1.1 200 1070000",
                    "user": {
                      "id": "test@gmail.com"
                    }
                  }
                }
              ]
            }
          }
        },
        "info": {
          "doc_count": 2,
          "top_filters_hits": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": 1.0,
              "hits": [
                {
                  "_index": "67033379",
                  "_type": "_doc",
                  "_id": "1",
                  "_score": 1.0,
                  "_source": {
                    "message": "INFO GET /search HTTP/1.1 200 1070000",
                    "user": {
                      "id": "test@gmail.com"
                    }
                  }
                },
                {
                  "_index": "67033379",
                  "_type": "_doc",
                  "_id": "3",
                  "_score": 1.0,
                  "_source": {
                    "message": "INFO GET /parse HTTP/1.1 200 1070000",
                    "user": {
                      "id": "test@gmail.com"
                    }
                  }
                }
              ]
            }
          }
        }
      }
    }
  }
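
To use the result on the client side, the documents have to be pulled out of each bucket's nested hits. The following Python sketch shows one way to do that, assuming the response has already been parsed into a dict. The helper name documents_by_bucket is hypothetical, and sample_response is a trimmed copy of the output above, not a real client call.

```python
def documents_by_bucket(response, agg_name="messages", hits_name="top_filters_hits"):
    """Map each bucket name to the list of _source documents it contains."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {
        bucket_name: [hit["_source"] for hit in bucket[hits_name]["hits"]["hits"]]
        for bucket_name, bucket in buckets.items()
    }


# Trimmed copy of the response shown above (only the fields the helper reads).
sample_response = {
    "aggregations": {
        "messages": {
            "buckets": {
                "error": {
                    "doc_count": 2,
                    "top_filters_hits": {"hits": {"hits": [
                        {"_id": "2", "_source": {
                            "message": "Error GET /search HTTP/1.1 200 1070000",
                            "user": {"id": "test@gmail.com"}}},
                        {"_id": "4", "_source": {
                            "message": "Error GET /parse HTTP/1.1 200 1070000",
                            "user": {"id": "test@gmail.com"}}},
                    ]}},
                },
                "info": {
                    "doc_count": 2,
                    "top_filters_hits": {"hits": {"hits": [
                        {"_id": "1", "_source": {
                            "message": "INFO GET /search HTTP/1.1 200 1070000",
                            "user": {"id": "test@gmail.com"}}},
                        {"_id": "3", "_source": {
                            "message": "INFO GET /parse HTTP/1.1 200 1070000",
                            "user": {"id": "test@gmail.com"}}},
                    ]}},
                },
            }
        }
    }
}

grouped = documents_by_bucket(sample_response)
for name, docs in grouped.items():
    print(name, [doc["message"] for doc in docs])
```

Because the filters are named, the buckets come back as an object keyed by filter name, which is why the helper iterates over items() rather than a list.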

elasticsearch

elasticsearch-dsl

elasticsearch-aggregation

elasticsearch-5