Get distinct count by nested objects from Elasticsearch
I have an index with the following mapping:
{
  "mappings": {
    "properties": {
      "typed_obj": {
        "type": "nested",
        "properties": {
          "id": {"type": "keyword"},
          "type": {"type": "keyword"}
        }
      }
    }
  }
}
and these documents:
{"index" : {}}
{"typed_obj": [{"id": "1", "type": "one"}, {"id": "2", "type": "two"}]}
{"index" : {}}
{"typed_obj": [{"id": "1", "type": "one"}, {"id": "2", "type": "one"}]}
{"index" : {}}
{"typed_obj": [{"id": "1", "type": "one"}, {"id": "3", "type": "one"}]}
{"index" : {}}
{"typed_obj": [{"id": "1", "type": "one"}, {"id": "4", "type": "two"}]}
How can I group typed_obj by type and count the unique IDs for each type? Something like:
{
  "type": "one",
  "count": 3
},
{
  "type": "two",
  "count": 2
}
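As a sanity check on those expected numbers, here is a minimal Python sketch that computes the distinct IDs per type directly from the four sample documents, without Elasticsearch. The function name `distinct_ids_per_type` is hypothetical and only used for illustration.

```python
from collections import defaultdict

# The four sample documents from the question.
docs = [
    {"typed_obj": [{"id": "1", "type": "one"}, {"id": "2", "type": "two"}]},
    {"typed_obj": [{"id": "1", "type": "one"}, {"id": "2", "type": "one"}]},
    {"typed_obj": [{"id": "1", "type": "one"}, {"id": "3", "type": "one"}]},
    {"typed_obj": [{"id": "1", "type": "one"}, {"id": "4", "type": "two"}]},
]


def distinct_ids_per_type(documents):
    """Collect the set of ids seen for each type and return the set sizes."""
    ids_by_type = defaultdict(set)
    for doc in documents:
        for obj in doc["typed_obj"]:
            ids_by_type[obj["type"]].add(obj["id"])
    return {type_name: len(ids) for type_name, ids in ids_by_type.items()}


print(distinct_ids_per_type(docs))  # {'one': 3, 'two': 2}
```

Type "one" appears with ids 1, 2 and 3, and type "two" with ids 2 and 4, which matches the desired result.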
I put together a query with an aggregation:
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "obj_nested": {
      "nested": {
        "path": "typed_obj"
      },
      "aggs": {
        "by_type_and_id": {
          "multi_terms": {
            "terms": [
              {
                "field": "typed_obj.type"
              },
              {
                "field": "typed_obj.id"
              }
            ]
          }
        }
      }
    }
  },
  "size": 0
}
It returns:
"buckets": [
  {
    "key": [
      "one",
      "1"
    ],
    "key_as_string": "one|1",
    "doc_count": 4
  },
  {
    "key": [
      "one",
      "2"
    ],
    "key_as_string": "one|2",
    "doc_count": 1
  },
  {
    "key": [
      "one",
      "3"
    ],
    "key_as_string": "one|3",
    "doc_count": 1
  },
  {
    "key": [
      "two",
      "2"
    ],
    "key_as_string": "two|2",
    "doc_count": 1
  },
  {
    "key": [
      "two",
      "4"
    ],
    "key_as_string": "two|4",
    "doc_count": 1
  }
]
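In the backend application I can group the keys by their first element (the typed_obj type) and then take the length of each group, but my question is: is it possible to get the type/count pairs without fetching every id+type combination from the index?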
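For reference, the client-side grouping described above could look roughly like the following Python sketch. It works on the `buckets` list returned by the `multi_terms` query; the function name `count_ids_per_type` is hypothetical.

```python
from collections import Counter

# Buckets exactly as returned by the multi_terms aggregation above.
buckets = [
    {"key": ["one", "1"], "key_as_string": "one|1", "doc_count": 4},
    {"key": ["one", "2"], "key_as_string": "one|2", "doc_count": 1},
    {"key": ["one", "3"], "key_as_string": "one|3", "doc_count": 1},
    {"key": ["two", "2"], "key_as_string": "two|2", "doc_count": 1},
    {"key": ["two", "4"], "key_as_string": "two|4", "doc_count": 1},
]


def count_ids_per_type(multi_terms_buckets):
    """Count buckets per type.

    Each bucket stands for one distinct (type, id) pair, so the number of
    buckets that share a type equals the number of distinct ids for it.
    """
    return dict(Counter(bucket["key"][0] for bucket in multi_terms_buckets))


print(count_ids_per_type(buckets))  # {'one': 3, 'two': 2}
```

This only gives correct counts if every (type, id) bucket is returned. Since `multi_terms` returns a limited number of buckets per request (its `size` parameter defaults to 10), the approach means pulling every pair out of the index, which is the cost the question wants to avoid.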
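You need to use the Cardinality aggregation to count distinct values. Place it on typed_obj.id as a sub-aggregation of a terms aggregation on typed_obj.type, so that the distinct IDs are counted per type.
Query: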
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "obj_nested": {
      "nested": {
        "path": "typed_obj"
      },
      "aggs": {
        "type": {
          "terms": {
            "field": "typed_obj.type",
            "size": 10
          },
          "aggs": {
            "id": {
              "cardinality": {
                "field": "typed_obj.id"
              }
            }
          }
        }
      }
    }
  },
  "size": 0
}
Response:
"aggregations" : {
  "obj_nested" : {
    "doc_count" : 8,
    "type" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "one",
          "doc_count" : 6,
          "id" : {
            "value" : 3
          }
        },
        {
          "key" : "two",
          "doc_count" : 2,
          "id" : {
            "value" : 2
          }
        }
      ]
    }
  }
}
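To turn this response into the type/count pairs asked for in the question, something like the Python sketch below should be enough. The function name `type_counts` and its keyword arguments are hypothetical; the aggregation names (`obj_nested`, `type`, `id`) are the ones used in the query above.

```python
# The "aggregations" section of the response shown above.
response = {
    "aggregations": {
        "obj_nested": {
            "doc_count": 8,
            "type": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {"key": "one", "doc_count": 6, "id": {"value": 3}},
                    {"key": "two", "doc_count": 2, "id": {"value": 2}},
                ],
            },
        }
    }
}


def type_counts(resp, nested_agg="obj_nested", terms_agg="type", card_agg="id"):
    """Flatten the terms buckets into [{"type": ..., "count": ...}, ...]."""
    buckets = resp["aggregations"][nested_agg][terms_agg]["buckets"]
    return [
        {"type": bucket["key"], "count": bucket[card_agg]["value"]}
        for bucket in buckets
    ]


print(type_counts(response))
# [{'type': 'one', 'count': 3}, {'type': 'two', 'count': 2}]
```

Note that `doc_count` (6 and 2) is the number of nested objects per type, while the `id.value` field holds the distinct ID count that the question is after.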
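Note: the Elasticsearch documentation describes the cardinality aggregation as "a single-value metrics aggregation that calculates an approximate count of distinct values", so the counts can be inexact when a field has many distinct values.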
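If the approximation is a concern, the cardinality aggregation accepts a `precision_threshold` option that trades memory for accuracy; counts below the threshold are expected to be close to exact. The Python sketch below builds the same query body with that option set. The helper name `build_query` and its parameters are hypothetical, and the value 1000 is only an example, not a recommendation.

```python
import json


def build_query(precision_threshold=None, terms_size=10):
    """Build the terms + cardinality query body from the answer above.

    precision_threshold is only added when given, so calling the helper
    without arguments reproduces the original query.
    """
    cardinality = {"field": "typed_obj.id"}
    if precision_threshold is not None:
        cardinality["precision_threshold"] = precision_threshold

    return {
        "query": {"match_all": {}},
        "aggs": {
            "obj_nested": {
                "nested": {"path": "typed_obj"},
                "aggs": {
                    "type": {
                        "terms": {"field": "typed_obj.type", "size": terms_size},
                        "aggs": {"id": {"cardinality": cardinality}},
                    }
                },
            }
        },
        "size": 0,
    }


# Example: request higher precision for the distinct-id counts.
print(json.dumps(build_query(precision_threshold=1000), indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index. For the small sample data here the default settings already return the exact counts shown in the response.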