Elasticsearch: aggregation min_doc_count for weeks doesn't work
I have the following aggregation with interval=week and min_doc_count=0:
{
  "aggs": {
    "scores_by_date": {
      "date_histogram": {
        "field": "date",
        "format": "yyyy-MM-dd",
        "interval": "week",
        "min_doc_count": 0
      }
    }
  }
}
and a date filter from Jan-01-2015 to Feb-23-2015:
{
  "range": {
    "document.date": {
      "from": "2015-01-01",
      "to": "2015-02-23"
    }
  }
}
I expected Elasticsearch to fill in all seven weeks even if the buckets are empty, but the result ends up with only one item:
{
  "aggregations": {
    "scores_by_date": {
      "buckets": [
        {
          "key_as_string": "2015-01-05",
          "key": 1420416000000,
          "doc_count": 5
        }
      ]
    }
  }
}
Elasticsearch version: 1.4.0
What is wrong with my aggregation, or how can I get Elasticsearch to fill in the missing weeks?
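You could try specifying extended bounds (the feature is described on the official documentation page for histogram aggregations). The most relevant part of that documentation is: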
With extended_bounds setting, you now can "force" the histogram aggregation to start building buckets on a specific min values and also keep on building buckets up to a max value (even if there are no documents anymore). Using extended_bounds only makes sense when min_doc_count is 0 (the empty buckets will never be returned if min_doc_count is greater than 0).
So your aggregation may have to look like this to force ES to return the empty buckets in that range:
{
  "aggs": {
    "scores_by_date": {
      "date_histogram": {
        "field": "date",
        "format": "yyyy-MM-dd",
        "interval": "week",
        "min_doc_count": 0,
        "extended_bounds": {
          "min": "2015-01-01",
          "max": "2015-02-23"
        }
      }
    }
  }
}
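If changing the query is not an option, the gaps can also be filled on the client. The following is a minimal Python sketch, not part of the original answer. It assumes that weekly buckets start on Monday in UTC (which the key 2015-01-05 in the response above suggests), and the helper names `week_starts` and `fill_missing_weeks` are hypothetical.

```python
from datetime import date, timedelta


def week_starts(min_date, max_date):
    """Return the Monday of every calendar week touching [min_date, max_date]."""
    # Assumption: weekly buckets are aligned to Monday (UTC).
    current = min_date - timedelta(days=min_date.weekday())
    starts = []
    while current <= max_date:
        starts.append(current)
        current += timedelta(days=7)
    return starts


def fill_missing_weeks(buckets, min_date, max_date):
    """Merge returned buckets with zero-count entries for weeks that have none."""
    counts = {b["key_as_string"]: b["doc_count"] for b in buckets}
    return [
        {"key_as_string": d.isoformat(), "doc_count": counts.get(d.isoformat(), 0)}
        for d in week_starts(min_date, max_date)
    ]


# The single bucket from the response in the question.
returned = [{"key_as_string": "2015-01-05", "key": 1420416000000, "doc_count": 5}]
filled = fill_missing_weeks(returned, date(2015, 1, 1), date(2015, 2, 23))
for bucket in filled:
    print(bucket["key_as_string"], bucket["doc_count"])
```

Under the Monday-alignment assumption, the range from 2015-01-01 to 2015-02-23 touches nine calendar weeks, starting with the week of 2014-12-29, so the filled list can be longer than the seven weeks the question expects.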