Elasticsearch equivalent of a SQL aggregation query
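I have been studying Elasticsearch aggregation queries, but I could not find out whether they support multiple aggregate functions. In other words, I would like to know whether Elasticsearch can produce the equivalent of this SQL aggregation query:
SELECT account_no, transaction_type, count(account_no), sum(amount), max(amount) FROM index_name GROUP BY account_no, transaction_type HAVING count(account_no) > 10
If so, how?
Thanks.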
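There are two possible ways to do what you are looking for in ES, and I cover both below.
I have also added a sample mapping and sample documents for reference.
Mapping: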
PUT index_name
{
  "mappings": {
    "mydocs": {
      "properties": {
        "account_no": {
          "type": "keyword"
        },
        "transaction_type": {
          "type": "keyword"
        },
        "amount": {
          "type": "double"
        }
      }
    }
  }
}
Sample documents:
Note that I have only created four transactions for a single customer.
POST index_name/mydocs/1
{
  "account_no": "1011",
  "transaction_type": "credit",
  "amount": 200
}
POST index_name/mydocs/2
{
  "account_no": "1011",
  "transaction_type": "credit",
  "amount": 400
}
POST index_name/mydocs/3
{
  "account_no": "1011",
  "transaction_type": "cheque",
  "amount": 100
}
POST index_name/mydocs/4
{
  "account_no": "1011",
  "transaction_type": "cheque",
  "amount": 100
}
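As a sanity check on the four sample documents, the grouping that the SQL query asks for can be sketched in plain Python. This is only an illustration of the expected result, not something Elasticsearch runs; the helper name `group_by_having` and its `min_count` parameter are made up for this sketch.

```python
from collections import defaultdict

# The four sample transactions indexed above.
docs = [
    {"account_no": "1011", "transaction_type": "credit", "amount": 200},
    {"account_no": "1011", "transaction_type": "credit", "amount": 400},
    {"account_no": "1011", "transaction_type": "cheque", "amount": 100},
    {"account_no": "1011", "transaction_type": "cheque", "amount": 100},
]


def group_by_having(rows, min_count):
    """Emulate GROUP BY account_no, transaction_type HAVING count(*) >= min_count.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for row in rows:
        key = (row["account_no"], row["transaction_type"])
        groups[key].append(row["amount"])
    # Keep only the groups that contain at least min_count rows.
    return {
        key: {"count": len(amounts), "sum": sum(amounts), "max": max(amounts)}
        for key, amounts in groups.items()
        if len(amounts) >= min_count
    }


expected = group_by_having(docs, min_count=2)
for (account_no, transaction_type), stats in sorted(expected.items()):
    print(account_no, transaction_type, stats)
```

Both groups contain exactly two transactions, so a threshold of 2 keeps them and a threshold of 3 would drop them.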
There are two ways to get what you are looking for:
Solution 1: Using the Elasticsearch Query DSL
Aggregation query:
For the Query DSL approach, I used the following aggregations:
- Terms Aggregation
- Sum Aggregation Query (Metric Aggregation)
- Max Aggregation Query (Metric Aggregation)
Below is an outline of the query, so that it is clear which aggregations are siblings and which are parents.
- Terms Aggregation (for every account)
  - Terms Aggregation (for every transaction_type)
    - Sum Amount
    - Max Amount
Here is the actual query:
POST index_name/_search
{
  "size": 0,
  "aggs": {
    "account_no_agg": {
      "terms": {
        "field": "account_no"
      },
      "aggs": {
        "transaction_type_agg": {
          "terms": {
            "field": "transaction_type",
            "min_doc_count": 2
          },
          "aggs": {
            "sum_amount": {
              "sum": {
                "field": "amount"
              }
            },
            "max_amount": {
              "max": {
                "field": "amount"
              }
            }
          }
        }
      }
    }
  }
}
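The important thing to note is min_doc_count, which plays the role of HAVING count(account_no) > 10. It is an inclusive lower bound on a bucket's document count, so min_doc_count: 2 in my query corresponds to HAVING count(account_no) >= 2 (that is, > 1); for > 10 you would set min_doc_count to 11. Because it is set on the inner terms aggregation, it applies to each (account_no, transaction_type) group.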
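The mapping from GROUP BY and HAVING to nested terms aggregations can be sketched as a small Python function that builds the request body. The function `build_group_by_query` and its parameters are hypothetical names for this sketch, not part of any Elasticsearch client; it only produces a dictionary that could be sent as the body of a `_search` request.

```python
import json


def build_group_by_query(group_fields, amount_field, having_count_gt):
    """Build a nested terms aggregation body (hypothetical helper).

    group_fields    -- GROUP BY columns, outermost first
    amount_field    -- field used for the sum and max metrics
    having_count_gt -- N in "HAVING count(...) > N"
    """
    # Innermost level: the two metric aggregations, which are siblings.
    node = {
        "sum_amount": {"sum": {"field": amount_field}},
        "max_amount": {"max": {"field": amount_field}},
    }
    # Wrap the metrics in one terms aggregation per GROUP BY column,
    # starting from the innermost column and moving outwards.
    for depth, field in enumerate(reversed(group_fields)):
        terms = {"field": field}
        if depth == 0:
            # min_doc_count is inclusive, so "count > N" becomes N + 1.
            terms["min_doc_count"] = having_count_gt + 1
        node = {field + "_agg": {"terms": terms, "aggs": node}}
    return {"size": 0, "aggs": node}


body = build_group_by_query(
    ["account_no", "transaction_type"], "amount", having_count_gt=1
)
print(json.dumps(body, indent=2))
```

With `having_count_gt=1` this reproduces the query shown above; passing 10 would yield `"min_doc_count": 11`. Keep in mind that a terms aggregation returns only the top buckets (its `size` defaults to 10), so grouping over many accounts would need a larger `size`.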
Query response:
{
  "took" : 5,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 4,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "account_no_agg" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "1011",                 <---- account_no
          "doc_count" : 4,                <---- all documents for this account
          "transaction_type_agg" : {
            "doc_count_error_upper_bound" : 0,
            "sum_other_doc_count" : 0,
            "buckets" : [
              {
                "key" : "cheque",         <---- transaction_type
                "doc_count" : 2,          <---- count(account_no) for this group
                "sum_amount" : {          <---- sum(amount)
                  "value" : 200.0
                },
                "max_amount" : {          <---- max(amount)
                  "value" : 100.0
                }
              },
              {
                "key" : "credit",         <---- another transaction_type
                "doc_count" : 2,          <---- count(account_no) for this group
                "sum_amount" : {          <---- sum(amount)
                  "value" : 600.0
                },
                "max_amount" : {          <---- max(amount)
                  "value" : 400.0
                }
              }
            ]
          }
        }
      ]
    }
  }
}
Look closely at the results above; I have added comments where needed to show which part of the SQL query each value corresponds to.
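To make the correspondence with the SQL result set concrete, the nested buckets can be flattened into one row per (account_no, transaction_type) group. The sketch below assumes the aggregation names used in the query above; `flatten_buckets` is a hypothetical helper, and `sample_response` is a trimmed copy of the response shown above.

```python
def flatten_buckets(response):
    """Turn the nested terms buckets into SQL-like rows (hypothetical helper)."""
    rows = []
    for account in response["aggregations"]["account_no_agg"]["buckets"]:
        for txn in account["transaction_type_agg"]["buckets"]:
            rows.append({
                "account_no": account["key"],
                "transaction_type": txn["key"],
                # The inner doc_count is the per-group count.
                "count": txn["doc_count"],
                "sum_amount": txn["sum_amount"]["value"],
                "max_amount": txn["max_amount"]["value"],
            })
    return rows


# Trimmed copy of the response above (only the fields the helper reads).
sample_response = {
    "aggregations": {
        "account_no_agg": {
            "buckets": [
                {
                    "key": "1011",
                    "doc_count": 4,
                    "transaction_type_agg": {
                        "buckets": [
                            {
                                "key": "cheque",
                                "doc_count": 2,
                                "sum_amount": {"value": 200.0},
                                "max_amount": {"value": 100.0},
                            },
                            {
                                "key": "credit",
                                "doc_count": 2,
                                "sum_amount": {"value": 600.0},
                                "max_amount": {"value": 400.0},
                            },
                        ]
                    },
                }
            ]
        }
    }
}

rows = flatten_buckets(sample_response)
for row in rows:
    print(row)
```

Each row carries the same five values as one row of the SQL result: the two grouping keys, the group count, the sum, and the max.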
Solution 2: Using Elasticsearch SQL (X-Pack)
If you are using the SQL Access feature of X-Pack, you can simply paste the SELECT query and run it against the mapping and documents above:
Elasticsearch SQL:
POST /_xpack/sql?format=txt
{
  "query": "SELECT account_no, transaction_type, sum(amount), max(amount), count(account_no) FROM index_name GROUP BY account_no, transaction_type HAVING count(account_no) > 1"
}
Elasticsearch SQL result:
  account_no   |transaction_type|  SUM(amount)  |  MAX(amount)  |COUNT(account_no)
---------------+----------------+---------------+---------------+-----------------
1011           |cheque          |200.0          |100.0          |2
1011           |credit          |600.0          |400.0          |2
Note that I tested these queries on ES 6.5.4.
Hope this helps!