ElasticSearch 2.1.0 - Deep 'children' aggregation with 'sum' metric returning empty results
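I have a document type hierarchy that is two levels deep. The documents are related through parent-child relationships as follows: category > sub_category > item, i.e. each sub_category has a _parent field referring to a category id, and each item has a _parent field referring to a sub_category id.
Each item has a price field. Given a query for categories that includes conditions on sub-categories and items, I want to calculate the total price for each sub_category.
My query looks like this: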
{
  "query": {
    "has_child": {
      "child_type": "sub_category",
      "query": {
        "has_child": {
          "child_type": "item",
          "query": {
            "range": {
              "price": {
                "gte": 100,
                "lte": 150
              }
            }
          }
        }
      }
    }
  }
}
My aggregation to calculate the price per sub-category looks like this:
{
  "aggs": {
    "categories": {
      "terms": {
        "field": "id"
      },
      "aggs": {
        "sub_categories": {
          "children": {
            "type": "sub_category"
          },
          "aggs": {
            "sub_category_ids": {
              "terms": {
                "field": "id"
              },
              "aggs": {
                "items": {
                  "children": {
                    "type": "item"
                  },
                  "aggs": {
                    "price": {
                      "sum": {
                        "field": "price"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
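For anyone trying to reproduce this, here is a minimal Python sketch that merges the query and the aggregation above into one search request body. It assumes both fragments are sent together in a single search request; the helper name `build_search_body` is hypothetical, and no client library is involved, so the resulting dict can be serialized with `json.dumps` and sent with any HTTP client.

```python
import json


def build_search_body(min_price, max_price):
    """Combine the has_child query with the nested children/terms/sum aggregation."""
    # Categories having a sub_category that has an item in the price range.
    query = {
        "has_child": {
            "child_type": "sub_category",
            "query": {
                "has_child": {
                    "child_type": "item",
                    "query": {
                        "range": {"price": {"gte": min_price, "lte": max_price}}
                    },
                }
            },
        }
    }
    # Innermost level: sum of item prices under each sub_category bucket.
    items_agg = {
        "children": {"type": "item"},
        "aggs": {"price": {"sum": {"field": "price"}}},
    }
    sub_category_ids_agg = {
        "terms": {"field": "id"},
        "aggs": {"items": items_agg},
    }
    aggs = {
        "categories": {
            "terms": {"field": "id"},
            "aggs": {
                "sub_categories": {
                    "children": {"type": "sub_category"},
                    "aggs": {"sub_category_ids": sub_category_ids_agg},
                }
            },
        }
    }
    return {"query": query, "aggs": aggs}


body = build_search_body(100, 150)
print(json.dumps(body, indent=2))
```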
Although the query response lists matching results, the aggregation response does not match any items:
{
  "aggregations": {
    "categories": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "category1",
          "doc_count": 1,
          "sub_categories": {
            "doc_count": 3,
            "sub_category_ids": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "subcat1",
                  "doc_count": 1,
                  "items": {
                    "doc_count": 0,
                    "price": {
                      "value": 0
                    }
                  }
                },
                {
                  "key": "subcat2",
                  "doc_count": 1,
                  "items": {
                    "doc_count": 0,
                    "price": {
                      "value": 0
                    }
                  }
                },
                {
                  "key": "subcat3",
                  "doc_count": 1,
                  "items": {
                    "doc_count": 0,
                    "price": {
                      "value": 0
                    }
                  }
                }
              ]
            }
          }
        }
      ]
    }
  }
}
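To make the symptom easier to see, the nested response can be flattened into one row per sub-category. The following is a small sketch; the function name `sums_per_sub_category` is hypothetical, and the sample dict is a trimmed copy of the response above with a single sub-category bucket.

```python
def sums_per_sub_category(response):
    """Return (category, sub_category, item_count, total_price) rows."""
    rows = []
    for cat in response["aggregations"]["categories"]["buckets"]:
        for sub in cat["sub_categories"]["sub_category_ids"]["buckets"]:
            items = sub["items"]
            rows.append(
                (cat["key"], sub["key"], items["doc_count"], items["price"]["value"])
            )
    return rows


# Trimmed copy of the response shown above (one sub-category only).
sample = {
    "aggregations": {
        "categories": {
            "buckets": [
                {
                    "key": "category1",
                    "doc_count": 1,
                    "sub_categories": {
                        "doc_count": 3,
                        "sub_category_ids": {
                            "buckets": [
                                {
                                    "key": "subcat1",
                                    "doc_count": 1,
                                    "items": {"doc_count": 0, "price": {"value": 0}},
                                }
                            ]
                        },
                    },
                }
            ]
        }
    }
}

rows = sums_per_sub_category(sample)
print(rows)  # [('category1', 'subcat1', 0, 0)]
```

An item count of 0 under every sub-category, while the query itself returns hits, is the signature of the problem described here.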
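However, omitting the sub_category_ids aggregation does cause the items to show up, and the prices are then summed at the level of the categories aggregation. I expected that including the sub_category_ids aggregation would simply change the level at which the prices are summed.
Am I misunderstanding how the aggregation is evaluated, and if so, how can I modify it to show the total price per sub-category?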
I opened issue #15413 about the children aggregation, because I and other people were facing similar problems in ES 2.0.
Apparently, according to ES developer @martijnvg, the problem is that:
The children agg makes an assumption (that all segments are being seen by children agg) that was true in 1.x but not in 2.x
PR #15457, again by @martijnvg, fixes this issue:
Before we only evaluated segments that yielded matches in parent aggs, which caused us to miss to evaluate child docs in segments we didn't have parent matches for.
The fix for this is stop remember in what segments we have matches for
and simply evaluate all segments. This makes the code simpler and we
can still quickly see if a segment doesn't hold child docs like we did
before
This pull request has been merged and has also been backported to the 2.x, 2.1 and 2.0 branches.
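The behaviour described in the PR can be illustrated with a toy model. This is not Elasticsearch code: the `segments` structure and the `sum_children` function are made up for illustration, and the real implementation works on Lucene segments and differs in detail. The sketch only shows why skipping segments without parent matches loses child docs that live in other segments.

```python
# Toy model: each "segment" holds some parent ids and some
# (parent_id, price) child docs. Parent and child sit in different segments.
segments = [
    {"parents": ["subcat1"], "children": []},
    {"parents": [], "children": [("subcat1", 120)]},
]


def sum_children(segments, only_segments_with_parent_matches):
    """Sum child prices for matched parents, optionally skipping segments."""
    matched_parents = {p for seg in segments for p in seg["parents"]}
    total = 0
    for seg in segments:
        if only_segments_with_parent_matches and not seg["parents"]:
            # Pre-fix behaviour in this model: the segment is skipped,
            # so its child docs are never evaluated.
            continue
        if not seg["children"]:
            # Cheap check kept after the fix: nothing to do in this segment.
            continue
        total += sum(
            price for parent, price in seg["children"] if parent in matched_parents
        )
    return total


before_fix = sum_children(segments, only_segments_with_parent_matches=True)
after_fix = sum_children(segments, only_segments_with_parent_matches=False)
print(before_fix, after_fix)  # 0 120
```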