How to create sub-bucket aggregations based on a hierarchy field value (part of the field)
Suppose we have documents with a hierarchy field like the following:
POST subbuckets/_doc
{
  "hierarchy": "this/is/some/hierarchy"
}
POST subbuckets/_doc
{
  "hierarchy": "this/is/some/hierarchy2"
}
POST subbuckets/_doc
{
  "hierarchy": "this/is/another/hierarchy1"
}
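I want to count the number of documents that belong to each level of the hierarchy, i.e.:
"this" has 3 documents
"this/is" has 3 documents
"this/is/some" has 2 documents
"this/is/another" has 1 document
"this/is/another/hierarchy1" has 1 document
"this/is/some/hierarchy" has 1 document
"this/is/some/hierarchy2" has 1 document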
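An analyzer cannot be applied to a keyword field, so as a workaround we define the field as text with a path_hierarchy analyzer, and enable aggregations on that text field by setting "fielddata": true. See the configuration below.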
Index mapping:
PUT index5
{
  "settings": {
    "analysis": {
      "analyzer": {
        "path-analyzer": {
          "tokenizer": "path-tokenizer"
        }
      },
      "tokenizer": {
        "path-tokenizer": {
          "type": "path_hierarchy",
          "delimiter": "/"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "hierarchy": {
        "type": "text",
        "analyzer": "path-analyzer",
        "search_analyzer": "keyword",
        "fielddata": true
      }
    }
  }
}
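To make it clearer why this mapping produces one bucket per level, here is a minimal Python sketch of what the path_hierarchy tokenizer does to a simple path. It is only an approximation for illustration: the function name path_hierarchy_tokens is hypothetical, and the sketch assumes paths without leading or trailing delimiters and ignores the tokenizer's other options.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Approximate the tokens emitted for a simple path (illustrative only).

    Assumption: the path has no leading or trailing delimiter.
    """
    parts = path.split(delimiter)
    # Emit every prefix of the path: first segment, first two segments, and so on.
    return [delimiter.join(parts[: i + 1]) for i in range(len(parts))]


print(path_hierarchy_tokens("this/is/some/hierarchy"))
```

Each of these prefixes is indexed as a separate term for the document, which is why a terms aggregation on the field can later count documents per level.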
Index the documents:
POST index5/_doc
{
  "hierarchy": "this/is/some/hierarchy"
}
POST index5/_doc
{
  "hierarchy": "this/is/some/hierarchy2"
}
POST index5/_doc
{
  "hierarchy": "this/is/another/hierarchy1"
}
Query:
POST index5/_search
{
  "aggs": {
    "path": {
      "terms": {
        "field": "hierarchy"
      }
    }
  },
  "size": 0
}
Response:
{
  "aggregations" : {
    "path" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "this",
          "doc_count" : 3
        },
        {
          "key" : "this/is",
          "doc_count" : 3
        },
        {
          "key" : "this/is/some",
          "doc_count" : 2
        },
        {
          "key" : "this/is/another",
          "doc_count" : 1
        },
        {
          "key" : "this/is/another/hierarchy1",
          "doc_count" : 1
        },
        {
          "key" : "this/is/some/hierarchy",
          "doc_count" : 1
        },
        {
          "key" : "this/is/some/hierarchy2",
          "doc_count" : 1
        }
      ]
    }
  }
}
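As a sanity check on the bucket counts, the same numbers can be reproduced outside Elasticsearch. The following is a rough Python sketch, not how Elasticsearch computes the aggregation internally: the helper names prefixes and count_levels are hypothetical, and the sort (document count descending, then key ascending) is an assumption chosen to match the ordering seen in the response.

```python
from collections import Counter

# The three sample documents from the question.
docs = [
    "this/is/some/hierarchy",
    "this/is/some/hierarchy2",
    "this/is/another/hierarchy1",
]


def prefixes(path, delimiter="/"):
    """Return the set of hierarchy levels (path prefixes) of one document."""
    parts = path.split(delimiter)
    return {delimiter.join(parts[: i + 1]) for i in range(len(parts))}


def count_levels(paths):
    """Count how many documents fall under each hierarchy level."""
    counts = Counter()
    for path in paths:
        # A set is used so each document counts at most once per level.
        counts.update(prefixes(path))
    return counts


counts = count_levels(docs)
# Assumed ordering: highest doc_count first, ties broken by key.
buckets = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
for key, doc_count in buckets:
    print(key, doc_count)
```

The printed counts line up with the doc_count values in the response: 3 for "this" and "this/is", 2 for "this/is/some", and 1 for each remaining level.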