Elasticsearch: accuracy on a filter aggregation

I'm new to Elasticsearch (using version 2.2). To simplify my problem: my documents have a field named termination, which sometimes takes the value transfer.

I currently run this request to count, per month, the documents that have that termination:

{
  "size": 0,
  "sort": [{
    "@timestamp": {
      "order": "desc",
      "unmapped_type": "boolean"
    }
  }],
  "query": { "match_all": {} },
  "aggs": {
    "report": {
      "date_histogram": {
        "field": "@timestamp",
        "interval": "month",
        "min_doc_count": 0
      },
      "aggs": {
        "calls_with_termination_transfer": {
          "filter": {
            "term": {
              "termination": "transfer"
            }
          }
        }
      }
    }
  }
}

Here is the response:

{
    "_shards": {
        "failed": 0, 
        "successful": 206, 
        "total": 206
    }, 
    "aggregations": {
        "report": {
            "buckets": [
                {
                    "calls_with_termination_transfer": {
                        "doc_count": 209163
                    }, 
                    "doc_count": 278100, 
                    "key": 1451606400000, 
                    "key_as_string": "2016-01-01T00:00:00.000Z"
                }, 
                {
                    "calls_with_termination_transfer": {
                        "doc_count": 107244
                    }, 
                    "doc_count": 136597, 
                    "key": 1454284800000, 
                    "key_as_string": "2016-02-01T00:00:00.000Z"
                }
            ]
        }
    }, 
    "hits": {
        "hits": [], 
        "max_score": 0.0, 
        "total": 414699
    }, 
    "timed_out": false, 
    "took": 90
}

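Why is the number of hits (414699) greater than the sum of the bucket doc counts (278100 + 136597 = 414697)? I've read about accuracy issues, but they don't seem to apply to filters... If I sum the number of documents with the transfer termination across buckets, will there be an accuracy problem there as well?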

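My guess is that some documents are missing @timestamp. A document without a value for that field does not fall into any date_histogram bucket, but it is still counted in hits.total.

You can verify this by running an exists query on that field.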
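As a rough sketch of that check, assuming the field is named @timestamp exactly as in the question and that the request goes to the same index and _search endpoint as the original one, a body like the following should count the documents that have no value for the field:

```json
{
  "size": 0,
  "query": {
    "bool": {
      "must_not": {
        "exists": { "field": "@timestamp" }
      }
    }
  }
}
```

If hits.total in the response comes back as 2, that would account for the whole gap (414699 - 414697). Any other number would mean the guess is at best only part of the explanation.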