Elasticsearch: nested aggregation based on an attribute, then a custom value from a formula in every bucket
I have a dataset of event actions:
{"person" : "person1", "event" : "e1", "action" : "like"}
{"person" : "person2", "event" : "e1", "action" : "dislike"}
{"person" : "person1", "event" : "e1", "action" : "share"}
{"person" : "person1", "event" : "e1", "action" : "rating"}
{"person" : "person1", "event" : "e2", "action" : "rating"}
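Can I aggregate first by event and then, within each bucket, get a single custom value computed as a weighted metric over the actions?
I have already done a nested aggregation: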
{
"size": 0,
"aggs": {
"all_events": {
"terms": {
"field": "event.keyword"
},
"aggs": {
"overall_ratings": {
"terms": {
"field": "action.keyword"
}
}
}
}
}
}
So I get results like:
- e1 -> like: 10, dislike: 4, share: 8
- e2 -> like: 30, dislike: 0, share: 2
But I want to apply a formula to get
- e1 -> (like*5) + (dislike*-3) + (share*2) = (10*5) + (4*-3) + (8*2) = 50 - 12 + 16 = 54
What I want:
- e1 -> 54
- e2 -> 154
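The arithmetic behind the desired values can be checked with a small Python sketch. It only restates the formula from the question; the names `WEIGHTS`, `weighted_score`, and `buckets` are illustrative and not part of any Elasticsearch API. Giving unlisted actions such as "rating" a weight of 0 is an assumption.

```python
# Weights taken from the formula in the question.
WEIGHTS = {"like": 5, "dislike": -3, "share": 2}


def weighted_score(action_counts):
    """Sum count * weight per action; unlisted actions are assumed to weigh 0."""
    return sum(WEIGHTS.get(action, 0) * count
               for action, count in action_counts.items())


# Per-event action counts as returned by the nested terms aggregation.
buckets = {
    "e1": {"like": 10, "dislike": 4, "share": 8},
    "e2": {"like": 30, "dislike": 0, "share": 2},
}

scores = {event: weighted_score(counts) for event, counts in buckets.items()}
print(scores)  # {'e1': 54, 'e2': 154}
```

This is the client-side equivalent: run the nested terms aggregation, then apply the weights to each bucket's counts.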
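Yes, you can build quite complex formulas with aggregations. Use a Scripted Metric aggregation.
In your example, with the data you provided, the result should be:
e1 -> (1*5) + (1*-3) + (1*2) = 5 - 3 + 2 = 4
e2 -> 0
The aggregation query would be: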
{
"size": 0,
"aggs": {
"all_events": {
"terms": {
"field": "event.keyword"
},
"aggs": {
"overall_ratings": {
"scripted_metric": {
"init_script": "params._agg.transactions = [];",
"map_script": "if (doc.action.value == 'like') params._agg.transactions.add(5); if (doc.action.value == 'dislike') params._agg.transactions.add(-3); if (doc.action.value == 'share') params._agg.transactions.add(2);",
"combine_script" : "int total = 0; for (t in params._agg.transactions) { total += t; } return total;",
"reduce_script" : "int total = 0; for (a in params._aggs) { total += a; } return total;"
}
}
}
}
}
}
This query gave me the following result:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 5,
"max_score": 0,
"hits": []
},
"aggregations": {
"all_events": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "e1",
"doc_count": 4,
"overall_ratings": {
"value": 4
}
},
{
"key": "e2",
"doc_count": 1,
"overall_ratings": {
"value": 0
}
}
]
}
}
}
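To see why the sample data yields 4 for e1 and 0 for e2, the four script phases of the scripted metric can be mimicked in plain Python. This is a rough sketch, not how Elasticsearch executes scripts internally: the round-robin split of documents across two "shards" and the names `scripted_metric` and `score_per_event` are assumptions made purely for illustration.

```python
from collections import defaultdict

WEIGHTS = {"like": 5, "dislike": -3, "share": 2}

docs = [
    {"person": "person1", "event": "e1", "action": "like"},
    {"person": "person2", "event": "e1", "action": "dislike"},
    {"person": "person1", "event": "e1", "action": "share"},
    {"person": "person1", "event": "e1", "action": "rating"},
    {"person": "person1", "event": "e2", "action": "rating"},
]


def scripted_metric(shards):
    """Mimic init/map/combine per shard, then reduce across shards."""
    partials = []
    for shard_docs in shards:
        transactions = []                    # init_script
        for doc in shard_docs:               # map_script, once per document
            weight = WEIGHTS.get(doc["action"])
            if weight is not None:
                transactions.append(weight)
        partials.append(sum(transactions))   # combine_script, once per shard
    return sum(partials)                     # reduce_script, on the coordinator


def score_per_event(documents, num_shards=2):
    """Bucket by event (like the terms aggregation), then score each bucket."""
    by_event = defaultdict(lambda: [[] for _ in range(num_shards)])
    for i, doc in enumerate(documents):
        # Hypothetical shard assignment: round-robin by document position.
        by_event[doc["event"]][i % num_shards].append(doc)
    return {event: scripted_metric(shards) for event, shards in by_event.items()}


result = score_per_event(docs)
print(result)  # {'e1': 4, 'e2': 0}
```

The "rating" documents add nothing to the list of transactions, which is why e2 ends up with 0 even though its doc_count is 1.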
One important thing: you may need to set "fielddata": true in the mapping for the action field:
"action": {
"type": "text",
...,
"fielddata": true
}
Otherwise you will get the exception "Fielddata is disabled on text fields by default...."