Elasticsearch average time difference aggregation query

I have documents in Elasticsearch, each of which looks like this:

{
  "id": "T12890ADSA12",
  "status": "ENDED",
  "type": "SAMPLE",
  "updatedAt": "2020-05-29T18:18:08.483Z",
  "events": [
    {
      "event": "STARTED",
      "version": 1,
      "timestamp": "2020-04-30T13:41:25.862Z"
    },
    {
      "event": "INPROGRESS",
      "version": 2,
      "timestamp": "2020-05-14T17:03:09.137Z"
    },
    {
      "event": "INPROGRESS",
      "version": 3,
      "timestamp": "2020-05-17T17:03:09.137Z"
    },
    {
      "event": "ENDED",
      "version": 4,
      "timestamp": "2020-05-29T18:18:08.483Z"
    }
  ],
  "createdAt": "2020-04-30T13:41:25.862Z"
}

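Now I want to write an Elasticsearch query that fetches all documents of type "SAMPLE" and returns the average time between start and end across all of those documents, e.g. avg(2020-05-29T18:18:08.483Z - 2020-04-30T13:41:25.862Z, ....). Assume that the STARTED and ENDED events each appear only once in the events array. Is there any way to do this?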

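You can do it like this. The query selects the documents of type SAMPLE whose status is ENDED (to make sure an ENDED event exists). The avg aggregation then uses a script to pick out the STARTED and ENDED timestamps and subtract them, returning the number of whole days between the two: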

POST test/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "status.keyword": "ENDED"
          }
        },
        {
          "term": {
            "type.keyword": "SAMPLE"
          }
        }
      ]
    }
  },
  "aggs": {
    "duration": {
      "avg": {
        "script": "Map findEvent(List events, String type) {return events.find(it -> it.event == type);} def started = Instant.parse(findEvent(params._source.events, 'STARTED').timestamp); def ended = Instant.parse(findEvent(params._source.events, 'ENDED').timestamp); return ChronoUnit.DAYS.between(started, ended);"
      }
    }
  }
}

Formatted for readability, the script looks like this:

Map findEvent(List events, String type) {
  return events.find(it -> it.event == type);
}
def started = Instant.parse(findEvent(params._source.events, 'STARTED').timestamp);
def ended = Instant.parse(findEvent(params._source.events, 'ENDED').timestamp); 
return ChronoUnit.DAYS.between(started, ended);
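
To sanity-check the numbers the aggregation returns, the same per-document calculation can be sketched client-side in Python. This is only an illustration and not part of the Elasticsearch query: the helper names (parse_ts, find_event, duration_days, average_duration_days) are made up for this example, and it assumes, as in the question, that each document contains exactly one STARTED and one ENDED event.

```python
from datetime import datetime, timedelta


def parse_ts(ts):
    # Turn the trailing "Z" into an explicit UTC offset so that
    # datetime.fromisoformat accepts it on older Python versions too.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def find_event(events, event_type):
    # Return the first event of the given type. Assumes it exists;
    # raises StopIteration otherwise.
    return next(e for e in events if e["event"] == event_type)


def duration_days(doc):
    # Whole days between STARTED and ENDED. Floor division matches the
    # truncation of ChronoUnit.DAYS.between as long as ENDED is not
    # earlier than STARTED.
    started = parse_ts(find_event(doc["events"], "STARTED")["timestamp"])
    ended = parse_ts(find_event(doc["events"], "ENDED")["timestamp"])
    return (ended - started) // timedelta(days=1)


def average_duration_days(docs):
    # Same filter as the query: type SAMPLE and status ENDED.
    durations = [
        duration_days(d)
        for d in docs
        if d["type"] == "SAMPLE" and d["status"] == "ENDED"
    ]
    return sum(durations) / len(durations) if durations else None


# The document from the question, reduced to the fields used here.
sample_doc = {
    "type": "SAMPLE",
    "status": "ENDED",
    "events": [
        {"event": "STARTED", "timestamp": "2020-04-30T13:41:25.862Z"},
        {"event": "INPROGRESS", "timestamp": "2020-05-14T17:03:09.137Z"},
        {"event": "ENDED", "timestamp": "2020-05-29T18:18:08.483Z"},
    ],
}

print(duration_days(sample_doc))  # 29
```

For the sample document this gives 29: the two timestamps are 29 days and roughly 4.6 hours apart, and the fractional part is dropped. If you need finer resolution, a smaller unit such as ChronoUnit.HOURS in the script would keep more of the difference.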