Elasticsearch equivalent of a SQL aggregation query

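I have been studying Elasticsearch aggregation queries, but I could not find out whether multiple aggregate functions are supported. In other words, I would like to know whether Elasticsearch can produce the equivalent of this SQL aggregation query: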

  SELECT account_no, transaction_type, count(account_no), sum(amount), max(amount) FROM index_name GROUP BY account_no, transaction_type Having count(account_no) > 10

If so, how? Thanks.

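There are two possible ways to do what you are looking for in ES, and I describe both below.

I have also added a sample mapping and sample documents for your reference.

Mapping: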

PUT index_name
{
  "mappings": {
    "mydocs":{
      "properties":{
        "account_no":{
          "type": "keyword"
        },
        "transaction_type":{
          "type": "keyword"
        },
        "amount":{
          "type":"double"
        }
      }
    }
  }
}

Sample documents:

Note that I have only created a list of 4 transactions for a single customer.

POST index_name/mydocs/1
{
  "account_no": "1011",
  "transaction_type":"credit",
  "amount": 200
}

POST index_name/mydocs/2
{
  "account_no": "1011",
  "transaction_type":"credit",
  "amount": 400
}

POST index_name/mydocs/3
{
  "account_no": "1011",
  "transaction_type":"cheque",
  "amount": 100
}

POST index_name/mydocs/4
{
  "account_no": "1011",
  "transaction_type":"cheque",
  "amount": 100
}
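As a reference point for what the SQL query should return on these four documents, here is a minimal Python sketch that evaluates the same GROUP BY / HAVING logic in memory. The function name `group_by_having` and its `min_count_exclusive` parameter are illustrative only and are not part of Elasticsearch.

```python
from collections import defaultdict

# The four sample documents indexed above.
DOCS = [
    {"account_no": "1011", "transaction_type": "credit", "amount": 200},
    {"account_no": "1011", "transaction_type": "credit", "amount": 400},
    {"account_no": "1011", "transaction_type": "cheque", "amount": 100},
    {"account_no": "1011", "transaction_type": "cheque", "amount": 100},
]


def group_by_having(docs, min_count_exclusive):
    """Emulate GROUP BY account_no, transaction_type HAVING count(account_no) > n."""
    # Collect the amounts of every (account_no, transaction_type) group.
    groups = defaultdict(list)
    for doc in docs:
        groups[(doc["account_no"], doc["transaction_type"])].append(doc["amount"])

    rows = []
    for (account_no, transaction_type), amounts in sorted(groups.items()):
        if len(amounts) > min_count_exclusive:  # the HAVING condition
            rows.append({
                "account_no": account_no,
                "transaction_type": transaction_type,
                "count": len(amounts),
                "sum_amount": sum(amounts),
                "max_amount": max(amounts),
            })
    return rows
```

With the sample data, `group_by_having(DOCS, 1)` yields one row for `cheque` and one for `credit`, while the threshold from the question, `group_by_having(DOCS, 10)`, yields no rows because each group has only 2 documents.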

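There are two ways to get what you are looking for:

Solution 1: Using the Elasticsearch Query DSL

Aggregation query:

For the Query DSL version, I used the following aggregations to solve what you are looking for.

Below is a summarized outline of the query, so that you can clearly see which aggregations are siblings and which are parents.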

- Terms Aggregation (For Every Account)
  - Terms Aggregation (For Every Transaction_type)
    - Sum Amount 
    - Max Amount

Below is the actual query:

POST index_name/_search
{
  "size": 0, 
  "aggs": {
    "account_no_agg": {
      "terms": {
        "field": "account_no"
      },
      "aggs": {
        "transaction_type_agg": {
          "terms": {
            "field": "transaction_type",
            "min_doc_count": 2
          },
          "aggs": {
            "sum_amount": {
              "sum": {
                "field": "amount"
              }
            },
            "max_amount":{
              "max": {
                "field": "amount"
              }
            }
          }
        }
      }
    }
  }
}

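An important detail is min_doc_count, which plays the role of the HAVING clause. It is an inclusive lower bound, so HAVING count(account_no) > 10 corresponds to "min_doc_count": 11. In my query, "min_doc_count": 2 keeps only the groups with count(account_no) >= 2, that is, HAVING count(account_no) > 1.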
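To make the threshold mapping explicit, here is a minimal Python sketch that builds the same request body for an arbitrary `HAVING count(account_no) > n`. The function name `build_group_by_query` and the `bucket_size` parameter are my own illustrative choices. The sketch sets the `size` option of the terms aggregation explicitly because a terms aggregation returns only the top 10 buckets by default, whereas SQL `GROUP BY` returns every group.

```python
def build_group_by_query(having_count_gt, bucket_size=10):
    """Build the nested terms aggregation body for
    GROUP BY account_no, transaction_type HAVING count(account_no) > n."""
    return {
        "size": 0,  # return aggregations only, no search hits
        "aggs": {
            "account_no_agg": {
                "terms": {"field": "account_no", "size": bucket_size},
                "aggs": {
                    "transaction_type_agg": {
                        "terms": {
                            "field": "transaction_type",
                            "size": bucket_size,
                            # min_doc_count is inclusive, so "> n" becomes n + 1
                            "min_doc_count": having_count_gt + 1,
                        },
                        "aggs": {
                            "sum_amount": {"sum": {"field": "amount"}},
                            "max_amount": {"max": {"field": "amount"}},
                        },
                    }
                },
            }
        },
    }
```

For example, `build_group_by_query(10)` produces `"min_doc_count": 11`, matching the original SQL, and `build_group_by_query(1)` reproduces the query above with `"min_doc_count": 2`. The resulting dictionary can be serialized with `json.dumps` and sent as the body of `POST index_name/_search`.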

Query response:

{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "account_no_agg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "1011",                         <----  account_no
          "doc_count" : 4,                        <----  all documents for this account
          "transaction_type_agg" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "cheque",                 <---- transaction_type
                "doc_count" : 2,                  <---- count(account_no) for this group
                "sum_amount" : {                  <----  sum(amount)
                  "value" : 200.0
                },
                "max_amount" : {                  <----  max(amount)
                  "value" : 100.0
                }
              },
              {
                "key" : "credit",                 <---- another transaction_type
                "doc_count" : 2,
                "sum_amount" : {                  <---- sum(amount)
                  "value" : 600.0
                },
                "max_amount" : {                  <---- max(amount)
                  "value" : 400.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}

Look carefully at the result above. I have added comments where needed to show which part of the SQL query each value corresponds to.
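If you need the nested buckets as flat, SQL-like rows, they can be unrolled on the client side. The following is a minimal Python sketch that assumes the aggregation names used in the query above (`account_no_agg`, `transaction_type_agg`, `sum_amount`, `max_amount`). The helper name `flatten_buckets` is hypothetical, and `SAMPLE_RESPONSE` is a trimmed copy of the response shown above.

```python
def flatten_buckets(response):
    """Turn the nested terms buckets into one row per (account_no, transaction_type)."""
    rows = []
    for account in response["aggregations"]["account_no_agg"]["buckets"]:
        for tx in account["transaction_type_agg"]["buckets"]:
            rows.append({
                "account_no": account["key"],
                "transaction_type": tx["key"],
                "count": tx["doc_count"],  # count for this group
                "sum_amount": tx["sum_amount"]["value"],
                "max_amount": tx["max_amount"]["value"],
            })
    return rows


# Trimmed copy of the response above (only the fields the helper reads).
SAMPLE_RESPONSE = {
    "aggregations": {
        "account_no_agg": {
            "buckets": [
                {
                    "key": "1011",
                    "doc_count": 4,
                    "transaction_type_agg": {
                        "buckets": [
                            {"key": "cheque", "doc_count": 2,
                             "sum_amount": {"value": 200.0},
                             "max_amount": {"value": 100.0}},
                            {"key": "credit", "doc_count": 2,
                             "sum_amount": {"value": 600.0},
                             "max_amount": {"value": 400.0}},
                        ]
                    },
                }
            ]
        }
    }
}

for row in flatten_buckets(SAMPLE_RESPONSE):
    print(row)
```

Note that the per-group count comes from the inner bucket's `doc_count`, not from the outer account bucket, whose `doc_count` covers all transaction types of that account.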

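Solution 2: Using Elasticsearch SQL (_xpack solution)

If you are using the SQL Access feature of X-Pack in Elasticsearch, you can simply copy and paste the SELECT query and run it against the above mapping and documents:

Elasticsearch SQL: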

POST /_xpack/sql?format=txt
{
  "query": "SELECT account_no, transaction_type, sum(amount), max(amount), count(account_no) FROM index_name GROUP BY account_no, transaction_type HAVING count(account_no) > 1"

}

Elasticsearch SQL result:

  account_no   |transaction_type|  SUM(amount)  |  MAX(amount)  |COUNT(account_no)
---------------+----------------+---------------+---------------+-----------------
1011           |cheque          |200.0          |100.0          |2                
1011           |credit          |600.0          |400.0          |2                

Note that I tested the queries in ES 6.5.4.

Hope this helps!