具有混合 nested/non-nested 过滤器的嵌套对象聚合术语
Nested object aggregation term with mixed nested/non-nested filter
我们有分面显示单击过滤器(并组合它们)时将显示的结果数量。像这样:
在我们介绍嵌套对象之前,以下内容可以完成这项工作:
GET /x_v1/_search/
{
"size": 0,
"aggs": {
"FilteredDescriptiveFeatures": {
"filter": {
"bool": {
"must": [
{
"terms": {
"breadcrumbs.categoryIds": [
"category"
]
}
},
{
"terms": {
"products.sterile": [
"0"
]
}
}
]
}
},
"aggs": {
"DescriptiveFeatures": {
"terms": {
"field": "products.descriptiveFeatures",
"size": 1000
}
}
}
}
}
}
这给出了结果:
"aggregations": {
"FilteredDescriptiveFeatures": {
"doc_count": 280,
"DescriptiveFeatures": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "somekey",
"doc_count": 42
},
虽然我们需要使 products
成为一个嵌套对象,但我目前正在尝试重写上面的内容以适应这一变化。
我的尝试如下所示。虽然它没有给出正确的结果,而且似乎没有正确连接到过滤器。
GET /x_v2/_search/
{
"size": 0,
"aggs": {
"FilteredDescriptiveFeatures": {
"filter": {
"bool": {
"must": [
{
"terms": {
"breadcrumbs.categoryIds": [
"category"
]
}
},
{
"nested": {
"path": "products",
"query": {
"terms": {
"products.sterile": [
"0"
]
}
}
}
}
]
}
},
"aggs": {
"nested": {
"nested": {
"path": "products"
},
"aggregations": {
"DescriptiveFeatures": {
"terms": {
"field": "products.descriptiveFeatures",
"size": 1000
}
}
}
}
}
}
}
}
这给出了结果:
"aggregations": {
"FilteredDescriptiveFeatures": {
"doc_count": 280,
"nested": {
"doc_count": 1437,
"DescriptiveFeatures": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "somekey",
"doc_count": 164
},
我还尝试将嵌套定义放在更高的位置以同时包含过滤器和聚合,但是不在嵌套对象中的过滤器术语 breadcrumbs.categoryId 将不起作用。
我正在尝试做的事情是否可行?
以及如何解决?
我从描述中了解到,您想根据一些嵌套和非嵌套字段过滤结果,然后在嵌套字段上应用聚合。我使用一些嵌套和非嵌套字段创建了一个示例索引和数据,并创建了一个查询
映射
PUT stack-557722203
{
"mappings": {
"_doc": {
"properties": {
"category": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"user": {
"type": "nested", // NESTED FIELD
"properties": {
"fName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"lName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
示例数据
POST _bulk
{"index":{"_index":"stack-557722203","_id":"1","_type":"_doc"}}
{"category":"X","user":[{"fName":"A","lName":"B","type":"X"},{"fName":"A","lName":"C","type":"X"},{"fName":"P","lName":"B","type":"Y"}]}
{"index":{"_index":"stack-557722203","_id":"2","_type":"_doc"}}
{"category":"X","user":[{"fName":"P","lName":"C","type":"Z"}]}
{"index":{"_index":"stack-557722203","_id":"3","_type":"_doc"}}
{"category":"X","user":[{"fName":"A","lName":"C","type":"Y"}]}
{"index":{"_index":"stack-557722203","_id":"4","_type":"_doc"}}
{"category":"Y","user":[{"fName":"A","lName":"C","type":"Y"}]}
查询
GET stack-557722203/_search
{
"size": 0,
"query": {
"bool": {
"must": [
{
"nested": {
"path": "user",
"query": {
"term": {
"user.fName.keyword": {
"value": "A"
}
}
}
}
},
{
"term": {
"category.keyword": {
"value": "X"
}
}
}
]
}
},
"aggs": {
"group BylName": {
"nested": {
"path": "user"
},
"aggs": {
"group By lName": {
"terms": {
"field": "user.lName.keyword",
"size": 10
},
"aggs": {
"reverse Nested": {
"reverse_nested": {} // NOTE THIS
}
}
}
}
}
}
}
输出
{
"took": 18,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": []
},
"aggregations": {
"group BylName": {
"doc_count": 4,
"group By lName": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "B",
"doc_count": 2,
"reverse Nested": {
"doc_count": 1
}
},
{
"key": "C",
"doc_count": 2,
"reverse Nested": {
"doc_count": 2
}
}
]
}
}
}
}
根据您获得的数据差异,当您将映射更改为 Nested
时,doc_count
中的更多文档是因为 Nested
和 Object(NonNested)
的方式] 文档被存储。请参阅 here to understand how are they internally stored. In order to connect them back to the root Document , you can use Reverse Nested 聚合,然后您将得到相同的结果。
希望对您有所帮助!!
在您的 FilteredDescriptiveFeatures
步骤中,您 return 所有包含一个产品 sterile = 0
的文档
但在 nested step
之后您不再指定此过滤器。因此,在此步骤中所有嵌套产品都是 return,因此您对所有产品进行术语聚合,而不仅仅是具有 sterile = 0
的产品
您应该在嵌套步骤中移动除菌过滤器。就像 Richa 指出的那样,您需要在最后一步使用 reverse_nested 聚合来计算 elasticsearch 文档而不是嵌套的产品子文档。
你能试试这个查询吗?
{
"size": 0,
"aggs": {
"filteredCategory": {
"filter": {
"terms": {
"breadcrumbs.categoryIds": [
"category"
]
}
},
"aggs": {
"nestedProducts": {
"nested": {
"path": "products"
},
"aggs": {
"filteredByProductsAttributes": {
"filter": {
"terms": {
"products.sterile": [
"0"
]
}
},
"aggs": {
"DescriptiveFeatures": {
"terms": {
"field": "products.descriptiveFeatures",
"size": 1000
},
"aggs": {
"productCount": {
"reverse_nested": {}
}
}
}
}
}
}
}
}
}
}
}
我们有分面显示单击过滤器(并组合它们)时将显示的结果数量。像这样:
在我们介绍嵌套对象之前,以下内容可以完成这项工作:
GET /x_v1/_search/
{
"size": 0,
"aggs": {
"FilteredDescriptiveFeatures": {
"filter": {
"bool": {
"must": [
{
"terms": {
"breadcrumbs.categoryIds": [
"category"
]
}
},
{
"terms": {
"products.sterile": [
"0"
]
}
}
]
}
},
"aggs": {
"DescriptiveFeatures": {
"terms": {
"field": "products.descriptiveFeatures",
"size": 1000
}
}
}
}
}
}
这给出了结果:
"aggregations": {
"FilteredDescriptiveFeatures": {
"doc_count": 280,
"DescriptiveFeatures": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "somekey",
"doc_count": 42
},
虽然我们需要使 products
成为一个嵌套对象,但我目前正在尝试重写上面的内容以适应这一变化。
我的尝试如下所示。虽然它没有给出正确的结果,而且似乎没有正确连接到过滤器。
GET /x_v2/_search/
{
"size": 0,
"aggs": {
"FilteredDescriptiveFeatures": {
"filter": {
"bool": {
"must": [
{
"terms": {
"breadcrumbs.categoryIds": [
"category"
]
}
},
{
"nested": {
"path": "products",
"query": {
"terms": {
"products.sterile": [
"0"
]
}
}
}
}
]
}
},
"aggs": {
"nested": {
"nested": {
"path": "products"
},
"aggregations": {
"DescriptiveFeatures": {
"terms": {
"field": "products.descriptiveFeatures",
"size": 1000
}
}
}
}
}
}
}
}
这给出了结果:
"aggregations": {
"FilteredDescriptiveFeatures": {
"doc_count": 280,
"nested": {
"doc_count": 1437,
"DescriptiveFeatures": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "somekey",
"doc_count": 164
},
我还尝试将嵌套定义放在更高的位置以同时包含过滤器和聚合,但是不在嵌套对象中的过滤器术语 breadcrumbs.categoryId 将不起作用。
我正在尝试做的事情是否可行? 以及如何解决?
我从描述中了解到,您想根据一些嵌套和非嵌套字段过滤结果,然后在嵌套字段上应用聚合。我使用一些嵌套和非嵌套字段创建了一个示例索引和数据,并创建了一个查询
映射
PUT stack-557722203
{
"mappings": {
"_doc": {
"properties": {
"category": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"user": {
"type": "nested", // NESTED FIELD
"properties": {
"fName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"lName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"type": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
示例数据
POST _bulk
{"index":{"_index":"stack-557722203","_id":"1","_type":"_doc"}}
{"category":"X","user":[{"fName":"A","lName":"B","type":"X"},{"fName":"A","lName":"C","type":"X"},{"fName":"P","lName":"B","type":"Y"}]}
{"index":{"_index":"stack-557722203","_id":"2","_type":"_doc"}}
{"category":"X","user":[{"fName":"P","lName":"C","type":"Z"}]}
{"index":{"_index":"stack-557722203","_id":"3","_type":"_doc"}}
{"category":"X","user":[{"fName":"A","lName":"C","type":"Y"}]}
{"index":{"_index":"stack-557722203","_id":"4","_type":"_doc"}}
{"category":"Y","user":[{"fName":"A","lName":"C","type":"Y"}]}
查询
GET stack-557722203/_search
{
"size": 0,
"query": {
"bool": {
"must": [
{
"nested": {
"path": "user",
"query": {
"term": {
"user.fName.keyword": {
"value": "A"
}
}
}
}
},
{
"term": {
"category.keyword": {
"value": "X"
}
}
}
]
}
},
"aggs": {
"group BylName": {
"nested": {
"path": "user"
},
"aggs": {
"group By lName": {
"terms": {
"field": "user.lName.keyword",
"size": 10
},
"aggs": {
"reverse Nested": {
"reverse_nested": {} // NOTE THIS
}
}
}
}
}
}
}
输出
{
"took": 18,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0,
"hits": []
},
"aggregations": {
"group BylName": {
"doc_count": 4,
"group By lName": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "B",
"doc_count": 2,
"reverse Nested": {
"doc_count": 1
}
},
{
"key": "C",
"doc_count": 2,
"reverse Nested": {
"doc_count": 2
}
}
]
}
}
}
}
根据您获得的数据差异,当您将映射更改为 Nested
时,doc_count
中的更多文档是因为 Nested
和 Object(NonNested)
的方式] 文档被存储。请参阅 here to understand how are they internally stored. In order to connect them back to the root Document , you can use Reverse Nested 聚合,然后您将得到相同的结果。
希望对您有所帮助!!
在您的 FilteredDescriptiveFeatures
步骤中,您 return 所有包含一个产品 sterile = 0
但在 nested step
之后您不再指定此过滤器。因此,在此步骤中所有嵌套产品都是 return,因此您对所有产品进行术语聚合,而不仅仅是具有 sterile = 0
您应该在嵌套步骤中移动除菌过滤器。就像 Richa 指出的那样,您需要在最后一步使用 reverse_nested 聚合来计算 elasticsearch 文档而不是嵌套的产品子文档。
你能试试这个查询吗?
{
"size": 0,
"aggs": {
"filteredCategory": {
"filter": {
"terms": {
"breadcrumbs.categoryIds": [
"category"
]
}
},
"aggs": {
"nestedProducts": {
"nested": {
"path": "products"
},
"aggs": {
"filteredByProductsAttributes": {
"filter": {
"terms": {
"products.sterile": [
"0"
]
}
},
"aggs": {
"DescriptiveFeatures": {
"terms": {
"field": "products.descriptiveFeatures",
"size": 1000
},
"aggs": {
"productCount": {
"reverse_nested": {}
}
}
}
}
}
}
}
}
}
}
}