Number of documents per day bucket, with filters applied
I have documents in Elasticsearch, where each document looks like this:
{
"id": "T12890ADSA12",
"status": "CREATED",
"type": "ABC",
"updatedAt": "2020-05-29T18:18:08.483Z",
"createdAt": "2020-04-30T13:41:25.862Z"
}
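For this document structure, I want to fetch all documents whose status is CREATED or SCHEDULED and whose type is ABC. Within those filtered documents, I want to aggregate the number of documents into day buckets based on currentDate - createdAt. For example:
- createdAt is today's date -> number of documents created today
- createdAt is yesterday's date -> number of documents created yesterday
and so on for the past 7 days.
Is there a simple way to do this in a single query?
Please find the mapping, sample documents, aggregation query, and response below:
Mapping: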
PUT my_date_index
{
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"status": {
"type": "keyword"
},
"type": {
"type": "keyword"
},
"updatedAt": {
"type": "date"
},
"createdAt": {
"type": "date"
}
}
}
}
Sample documents:
POST my_date_index/_doc/1
{
"id": "T12890ADSA12",
"status": "CREATED",
"type": "ABC",
"updatedAt": "2020-05-29T18:18:08.483Z",
"createdAt": "2020-07-06T05:00:00.000Z"
}
POST my_date_index/_doc/2
{
"id": "T12890ADSA13",
"status": "SCHEDULED",
"type": "ABC",
"updatedAt": "2020-05-29T18:18:08.483Z",
"createdAt": "2020-07-05T13:41:25.862Z"
}
POST my_date_index/_doc/3
{
"id": "T12890ADSA14",
"status": "SCHEDULED",
"type": "ABC",
"updatedAt": "2020-05-29T18:18:08.483Z",
"createdAt": "2020-07-04T06:00:00.000Z"
}
POST my_date_index/_doc/4
{
"id": "T12890ADSA15",
"status": "SCHEDULED",
"type": "ABC",
"updatedAt": "2020-05-29T18:18:08.483Z",
"createdAt": "2020-07-03T07:00:00.000Z"
}
Query request (remove "size": 0 to also return the matching documents):
POST my_date_index/_search
{
"size": 0,
"query": {
"bool": {
"must": [
{
"term": {
"type": "ABC"
}
},
{
"range": {
"createdAt": {
"gte": "now-7d",
"lte": "now"
}
}
}
],
"should": [
{
"term": {
"status": "SCHEDULED"
}
},
{
"term": {
"status": "CREATED"
}
}
],
"minimum_should_match": 1
}
},
"aggs": {
"my_date": {
"date_histogram": {
"field": "createdAt",
"calendar_interval": "day",
"order": {
"_key": "desc"
}
}
}
}
}
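Note that the query first filters the documents by the date range and by the conditions you mentioned.
On its own, that part would return all matching documents. On top of it, I applied a date histogram aggregation to get the number of documents for each day in that date range.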
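As a rough illustration of the filter-then-aggregate structure described above, here is a minimal Python sketch that builds an equivalent request body as a plain dictionary. The helper name `build_daily_count_query` and its parameters are hypothetical, not part of any client library. It also swaps the `should` + `minimum_should_match` pair for a single `terms` clause inside `filter`, which should match the same documents (scores differ, but they are unused when `size` is 0).

```python
import json


def build_daily_count_query(statuses, doc_type, days=7):
    """Build a search body that counts matching documents per calendar day."""
    return {
        "size": 0,  # aggregations only; no hits are returned
        "query": {
            "bool": {
                "filter": [
                    {"term": {"type": doc_type}},
                    # One terms clause matches any of the listed statuses.
                    {"terms": {"status": list(statuses)}},
                    {"range": {"createdAt": {"gte": f"now-{days}d", "lte": "now"}}},
                ]
            }
        },
        "aggs": {
            "my_date": {
                "date_histogram": {
                    "field": "createdAt",
                    "calendar_interval": "day",
                    "order": {"_key": "desc"},
                }
            }
        },
    }


body = build_daily_count_query(["CREATED", "SCHEDULED"], "ABC")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of POST my_date_index/_search, or passed to whichever client you use.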
Response:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"my_date" : {
"buckets" : [
{
"key_as_string" : "2020-07-06T00:00:00.000Z",
"key" : 1593993600000,
"doc_count" : 1
},
{
"key_as_string" : "2020-07-05T00:00:00.000Z",
"key" : 1593907200000,
"doc_count" : 1
},
{
"key_as_string" : "2020-07-04T00:00:00.000Z",
"key" : 1593820800000,
"doc_count" : 1
},
{
"key_as_string" : "2020-07-03T00:00:00.000Z",
"key" : 1593734400000,
"doc_count" : 1
}
]
}
}
}
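The buckets above are keyed by calendar day rather than by "days ago", and days without documents do not appear in this response. If you want the currentDate - createdAt view from the question, one option is to convert the buckets on the client side. The following is a minimal sketch, assuming UTC day boundaries (as in the bucket keys shown here); the function name `counts_by_days_ago` is hypothetical.

```python
from datetime import date, datetime, timezone


def counts_by_days_ago(buckets, today, days=7):
    """Map date_histogram buckets to {days_ago: doc_count}, filling gaps with 0."""
    counts = {n: 0 for n in range(days)}
    for bucket in buckets:
        # "key" is the bucket start in epoch milliseconds (UTC).
        bucket_day = datetime.fromtimestamp(bucket["key"] / 1000, tz=timezone.utc).date()
        age = (today - bucket_day).days
        if 0 <= age < days:
            counts[age] += bucket["doc_count"]
    return counts


# Buckets copied from the response above.
buckets = [
    {"key": 1593993600000, "doc_count": 1},
    {"key": 1593907200000, "doc_count": 1},
    {"key": 1593820800000, "doc_count": 1},
    {"key": 1593734400000, "doc_count": 1},
]

result = counts_by_days_ago(buckets, today=date(2020, 7, 6))
print(result)  # {0: 1, 1: 1, 2: 1, 3: 1, 4: 0, 5: 0, 6: 0}
```

Here key 0 means "created today", 1 means "created yesterday", and so on. In real use you would pass the current UTC date as `today`.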
Hope this helps!