date_histogram 的 Elasticsearch 聚合给出了错误的存储桶结果

Elasticsearch aggregation with date_histogram gives wrong result for buckets

我有带时间戳的数据。我想在上面做 date_histogram。

当我 运行 查询时 return 总计 13 这是正确的,但它在 2014-10-10 中显示了一条记录,但我在 [=13= 中找不到该记录] 我有。

curl http://localhost:9200/test/test/_search -X POST -d '{"fields":
 ["creation_time"],
  "query" :
      {"filtered":
          {"query":
              {"match":
                  {"type": "test.type"}
              }
          }
      },
  "aggs":
      {"group_by_created_by":
          {"date_histogram":
              {"field":"creation_time", "interval": "1d"}
          }
      }
 }' | python -m json.tool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2083  100  1733  100   350   234k  48590 --:--:-- --:--:-- --:--:--  241k
{
    "_shards": {
        "failed": 0,
        "successful": 5,
        "total": 5
    },
    "aggregations": {
        "group_by_created_at": {
            "buckets": [
                {
                    "doc_count": 12,
                    "key": 1412812800000,
                    "key_as_string": "2014-10-09T00:00:00.000Z"
                },
                {
                    "doc_count": 1,
                    "key": 1412899200000,
                    "key_as_string": "2014-10-10T00:00:00.000Z"
                }
            ]
        }
    },
    "hits": {
        "hits": [
            {
                "_id": "qk5EGDqUSoW-ckZU9bnSsA",
                "_index": "test",
                "_score": 3.730029,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T16:35:39.535389"
                    ]
                }
            },
            {
                "_id": "GnglI_3xRYii_oE5q91FUg",
                "_index": "test",
                "_score": 3.6149597,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T17:16:55.677919"
                    ]
                }
            },
            {
                "_id": "ELP1f_-IS8SJiT4i4Vh6_g",
                "_index": "test",
                "_score": 2.974081,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T01:21:21.691270"
                    ]
                }
            },
            {
                "_id": "ySlIV4vWRvm_q0-9p87dEQ",
                "_index": "test",
                "_score": 2.974081,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T01:33:51.291644"
                    ]
                }
            },
            {
                "_id": "swXVnMmJSsmNW30zeJvCoQ",
                "_index": "test",
                "_score": 2.974081,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T17:08:45.738821"
                    ]
                }
            },
            {
                "_id": "h0j6L-VGTnyChSIevtt2og",
                "_index": "test",
                "_score": 2.974081,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T22:35:16.908080"
                    ]
                }
            },
            {
                "_id": "ANoTEXIgRgml6gLD4YKtIg",
                "_index": "test",
                "_score": 2.9459102,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T01:25:18.869175"
                    ]
                }
            },
            {
                "_id": "FSCPBsogT5OXghBUmKXidQ",
                "_index": "test",
                "_score": 2.9459102,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T01:42:49.000599"
                    ]
                }
            },
            {
                "_id": "VEw6XbIySvW7h7GF7h4ynA",
                "_index": "test",
                "_score": 2.9459102,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T16:45:51.563595"
                    ]
                }
            },
            {
                "_id": "J9NfffAvRPmFxtOBZ6IsCA",
                "_index": "test",
                "_score": 2.9169223,
                "_type": "test",
                "fields": {
                    "creation_time": [
                        "2014-10-09T01:23:30.546353"
                    ]
                }
            }
        ],
        "max_score": 3.730029,
        "total": 13
    },
    "timed_out": false,
    "took": 4
}

如果您看到以上示例,则 10-10 上没有记录,但聚合显示该存储桶中有一条记录。

如果计算点击次数,您会发现只有 10 个对象。这是因为,默认情况下,Elasticsearch 将 return 仅 前十个结果命中 .

但是,即使 hits 中不存在所有与查询匹配的文档,在计算聚合时也会考虑在内。

尝试将您的查询更新为:

{
  "size": 13,
  "fields": ["creation_time"],
  "query" :
      {"filtered":
          {"query":
              {"match":
                  {"type": "test.type"}
              }
          }
      },
  "aggs":
      {"group_by_created_by":
          {"date_histogram":
              {"field":"creation_time", "interval": "1d"}
          }
      }
 }

您会看到在 10-10 日创建的文档。

对所有匹配的文档进行聚合。

你没有设置size表示你默认10个文档下点击。将 size 更改为 13(+),您的 2014-10-10 文档应该会显示。

当您有更多结果时,手动检查所有结果变得不方便,您还可以使用 top_hits 作为子聚合器来获取桶中内容的峰值(有一个 size 也有选项)。