How to aggregate over matched fields in an array in Elasticsearch

Question

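My objects have an array called properties. Each property is itself an object made up of the fields attribute and value (plus a few others that don't matter here).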

I want to find all values of a given attribute.

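My current approach is a filtered query on properties.attribute followed by an aggregation on properties.value. That is not sufficient, because the aggregation runs over every property defined on the matching documents, not only the properties whose properties.attribute matches the search.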

Is there a way to restrict the aggregation 'space' to the properties whose properties.attribute matches?

For completeness, here is the curl call. It finds many values, whereas I am only interested in those for 'farbe' (colour):

curl -XGET 'http://localhost:9200/pwo/Product/_search?size=0&pretty=true' -d '{
"query": {
  "filtered": {
    "query": { "match_all" : { } },
    "filter": {
      "bool": {
        "must": { "term": { "properties.attribute": "farbe" } }
      }
    }
  }
},
"aggregations": {
  "properties": {
    "terms": { "field": "properties.value" }
  }
 }
}'

Answer 1

A combination of a nested aggregation and a filter aggregation seems to do what you want, if I understand you correctly.

You will have to set up your mapping with the nested type, though.
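
The reason, as far as I understand the default behaviour: with a plain object mapping, an array of objects is flattened into one multi-valued field per key, so the link between each attribute and its value is lost. A minimal Python sketch of that flattening follows. It makes no Elasticsearch calls, and the sample product (in particular the `groesse`/`xl` property) is invented for illustration:

```python
from collections import Counter

# Hypothetical product; only the attribute name "farbe" comes from the question.
product = {
    "properties": [
        {"attribute": "farbe", "value": "rot"},
        {"attribute": "groesse", "value": "xl"},
    ]
}


def flatten(doc):
    """Mimic a non-nested object array: one list of values per key."""
    flat = {}
    for prop in doc["properties"]:
        for key, val in prop.items():
            flat.setdefault("properties." + key, []).append(val)
    return flat


def terms_on_flat(docs, attribute):
    """Filter on properties.attribute, then count every properties.value."""
    counts = Counter()
    for doc in docs:
        flat = flatten(doc)
        if attribute in flat["properties.attribute"]:
            # The whole document matched, so all of its values are counted.
            counts.update(sorted(set(flat["properties.value"])))
    return dict(counts)


print(flatten(product))
print(terms_on_flat([product], "farbe"))
```

Both `rot` and `xl` end up in the counts even though only `farbe` was requested, which is the behaviour described in the question. With a nested mapping each array entry is kept as a separate unit, so the filter can be applied per property.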

As a toy example, I set up a simple index like this:

PUT /test_index
{
   "settings": {
      "number_of_shards": 1
   },
   "mappings": {
      "doc": {
         "properties": {
            "properties": {
               "type": "nested",
               "properties": {
                  "attribute": {
                     "type": "string"
                  },
                  "value": {
                     "type": "string"
                  }
               }
            }
         }
      }
   }
}

(Note that this is a bit confusing, because in this example "properties" is both a mapping keyword and the name of a field.)

Now I can index a few documents:

POST /test_index/doc/_bulk
{"index":{"_id":1}}
{"properties":[{"attribute":"lorem","value":"Donec a diam lectus."},{"attribute":"ipsum","value":"Sed sit amet ipsum mauris."}]}
{"index":{"_id":2}}
{"properties":[{"attribute":"dolor","value":"Donec et mollis dolor."},{"attribute":"sit","value":"Donec sed odio eros."}]}
{"index":{"_id":3}}
{"properties":[{"attribute":"amet","value":"Vivamus fermentum semper porta."}]}

Then I can get an aggregation on "properties.value", filtered by "properties.attribute", as follows:

POST /test_index/_search?search_type=count
{
   "aggs": {
      "nested_properties": {
         "nested": {
            "path": "properties"
         },
         "aggs": {
            "filtered_by_attribute": {
               "filter": {
                  "terms": {
                     "properties.attribute": [
                        "lorem",
                        "amet"
                     ]
                  }
               },
               "aggs": {
                  "value_terms": {
                     "terms": {
                        "field": "properties.value"
                     }
                  }
               }
            }
         }
      }
   }
}

In this case, it returns:

{
   "took": 3,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 3,
      "max_score": 0,
      "hits": []
   },
   "aggregations": {
      "nested_properties": {
         "doc_count": 5,
         "filtered_by_attribute": {
            "doc_count": 2,
            "value_terms": {
               "doc_count_error_upper_bound": 0,
               "sum_other_doc_count": 0,
               "buckets": [
                  {
                     "key": "a",
                     "doc_count": 1
                  },
                  {
                     "key": "diam",
                     "doc_count": 1
                  },
                  {
                     "key": "donec",
                     "doc_count": 1
                  },
                  {
                     "key": "fermentum",
                     "doc_count": 1
                  },
                  {
                     "key": "lectus",
                     "doc_count": 1
                  },
                  {
                     "key": "porta",
                     "doc_count": 1
                  },
                  {
                     "key": "semper",
                     "doc_count": 1
                  },
                  {
                     "key": "vivamus",
                     "doc_count": 1
                  }
               ]
            }
         }
      }
   }
}
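
These counts can be reproduced by hand. The sketch below is a plain-Python imitation of the three aggregation levels over the sample documents, not a call into Elasticsearch. The regex tokenizer is only a rough stand-in for the default analysis of a string field, which is why the buckets hold single lowercased words rather than whole sentences:

```python
import re
from collections import Counter

# The properties arrays of the three documents from the _bulk request.
docs = [
    [{"attribute": "lorem", "value": "Donec a diam lectus."},
     {"attribute": "ipsum", "value": "Sed sit amet ipsum mauris."}],
    [{"attribute": "dolor", "value": "Donec et mollis dolor."},
     {"attribute": "sit", "value": "Donec sed odio eros."}],
    [{"attribute": "amet", "value": "Vivamus fermentum semper porta."}],
]


def tokenize(text):
    """Rough approximation of an analyzed string field: lowercased words."""
    return re.findall(r"\w+", text.lower())


def simulate(docs, wanted):
    # "nested": every entry of a properties array counts as its own document.
    nested = [prop for doc in docs for prop in doc]
    # "filter": keep only the entries whose attribute is in the wanted set.
    kept = [p for p in nested if p["attribute"] in wanted]
    # "terms": for each token, count how many kept entries contain it.
    counts = Counter()
    for prop in kept:
        counts.update(set(tokenize(prop["value"])))
    # Order buckets by descending count, then by key.
    buckets = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return len(nested), len(kept), buckets


nested_count, filtered_count, buckets = simulate(docs, {"lorem", "amet"})
print(nested_count, filtered_count)
print([key for key, _ in buckets])
```

The two numbers correspond to the `doc_count` of `nested_properties` and of `filtered_by_attribute`. If whole values are wanted as bucket keys, the value field would presumably have to be mapped as not analyzed; that is an assumption about the asker's goal rather than something stated here.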

Here is all the code I used, in one place:

http://sense.qbox.io/gist/1e0c58aae54090fadfde8856f4f6793b68de0167
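
Applied to the question, the same structure would filter on a single attribute. The sketch below only builds and prints a request body; it does not talk to a cluster. The helper name `attribute_values_agg` and the aggregation names are arbitrary, and it assumes the `pwo` index has been remapped so that `properties` is a nested field, which the question does not show:

```python
import json


def attribute_values_agg(attribute, path="properties"):
    """Build a nested -> filter -> terms aggregation body for one attribute."""
    return {
        "size": 0,  # ask for aggregations only, no hits
        "aggs": {
            "nested_properties": {
                "nested": {"path": path},
                "aggs": {
                    "filtered_by_attribute": {
                        # Keep only nested entries with the wanted attribute.
                        "filter": {"term": {path + ".attribute": attribute}},
                        "aggs": {
                            "value_terms": {
                                "terms": {"field": path + ".value"}
                            }
                        },
                    }
                },
            }
        },
    }


body = attribute_values_agg("farbe")
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of a `_search` request against that index, in the same way as the curl call in the question.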
