Elasticsearch aggregation: group values

My documents are structured as follows:

{
"title" : "A title",
"ExtraFields": [
    {
        "value": "print",
        "fieldID": "5535627631efa0843554b0ea"
    }
    ,
    {
        "value": "POLYE",
        "fieldID": "5535627631efa0843554b0ec"
    }
    ,
    {
        "value": "30",
        "fieldID": "5535627631efa0843554b0ed"
    }
    ,
    {
        "value": "0",
        "fieldID": "5535627631efa0843554b0ee"
    }
    ,
    {
        "value": "0",
        "fieldID": "5535627731efa0843554b0ef"
    }
    ,
    {
        "value": "0.42",
        "fieldID": "5535627831efa0843554b0f0"
    }
    ,
    {
        "value": "40",
        "fieldID": "5535627831efa0843554b0f1"
    }
    ,
    {
        "value": "30",
        "fieldID": "5535627831efa0843554b0f2"
    }
    ,
    {
        "value": "18",
        "fieldID": "5535627831efa0843554b0f3"
    }
    ,
    {
        "value": "24",
        "fieldID": "5535627831efa0843554b0f4"
    }
]
}

The ideal output (best case) would be:

[
{
    "field" : "5535627831efa0843554b0f4",
    "values" : [
        {
            "label" : "24",
            "count" : 2
        },
        {
            "label" : "18",
            "count" : 5
        }
    ]
},
{
    "field" : "5535627831efa0843554b0f3",
    "values" : [
        {
            "label" : "cott",
            "count" : 20
        },
        {
            "label" : "polye",
            "count" : 12
        }
    ]
}
]

But I could also settle for something simpler, such as the following (this is how I currently get it in MongoDB):

[
{
    "field" : "5535627831efa0843554b0f4",
    "value" : "24",
    "count" : 2
},
{
    "field" : "5535627831efa0843554b0f4",
    "value" : "18",
    "count" : 5
},
{
    "field" : "5535627831efa0843554b0f3",
    "value" : "cott",
    "count" : 20
},
{
    "field" : "5535627831efa0843554b0f3",
    "value" : "polye",
    "count" : 12
}
] 

What would the aggregation query look like? Does this structure need any special mapping?

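To get what you want, you need to map the ExtraFields sub-structure as nested. Without it, the value and fieldID of each array element are not kept together, so values cannot be grouped under the field they belong to. Your document mapping would look like this (doctype is the name I picked for your document type, but it can be whatever name you already use):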

PUT /test/_mapping/doctype
{
  "doctype": {
    "properties": {
      "title": {
        "type": "string"
      },
      "ExtraFields": {
        "type": "nested",
        "properties": {
          "value": {
            "type": "string",
            "index": "not_analyzed"
          },
          "fieldID": {
            "type": "string",
            "index": "not_analyzed"
          }
        }
      }
    }
  }
}
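The mapping above uses the string / not_analyzed syntax of older Elasticsearch releases. As a hedged sketch only: from Elasticsearch 5.x onwards the equivalent of a not_analyzed string is the keyword type, and from 7.x onwards mappings are no longer nested under a type name. The helper name build_mapping below is hypothetical and only builds the request body; it would be sent with something like PUT /test/_mapping on those newer versions.

```python
import json


def build_mapping():
    """Return a mapping body for newer Elasticsearch versions (assumed 7.x+).

    Hypothetical helper: it mirrors the mapping shown above, with
    "string" + "not_analyzed" replaced by "keyword" and the doctype
    wrapper removed.
    """
    return {
        "properties": {
            "title": {"type": "text"},
            "ExtraFields": {
                "type": "nested",
                "properties": {
                    # keyword fields are indexed as-is, so terms
                    # aggregations bucket on the exact stored value
                    "value": {"type": "keyword"},
                    "fieldID": {"type": "keyword"},
                },
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mapping(), indent=2))
```

If you are still on the version the original mapping targets, keep the mapping as written above; the nested type is the part that matters for the aggregation.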

Then you can index your document:

PUT /test/doctype/123
{
    "title" : "A title",
    "ExtraFields": [
       ...
    ]
}

and send the following aggregation query:

POST /test/doctype/_search
{
  "size": 0,
  "aggs": {
    "fields": {
      "nested": {
        "path": "ExtraFields"
      },
      "aggs": {
        "fields": {
          "terms": {
            "field": "ExtraFields.fieldID"
          },
          "aggs": {
            "values": {
              "terms": {
                "field": "ExtraFields.value"
              }
            }
          }
        }
      }
    }
  }
}

This yields the result you described as the best case. The JSON field names in the response are a bit different, but I guess that's fine.
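If the exact field names matter, the response can be reshaped on the client. The following is a minimal sketch, assuming the response has the usual shape for a nested aggregation named fields that wraps a terms aggregation named fields with a values sub-aggregation (buckets carrying key and doc_count). The function names to_grouped and to_flat and the sample response are hypothetical, and the counts are copied from the question for illustration rather than taken from a real cluster.

```python
def to_grouped(response):
    """Reshape the aggregation response into the 'best case' format:
    [{"field": ..., "values": [{"label": ..., "count": ...}, ...]}, ...]
    """
    field_buckets = response["aggregations"]["fields"]["fields"]["buckets"]
    return [
        {
            "field": fb["key"],
            "values": [
                {"label": vb["key"], "count": vb["doc_count"]}
                for vb in fb["values"]["buckets"]
            ],
        }
        for fb in field_buckets
    ]


def to_flat(response):
    """Reshape the response into the simpler one-row-per-pair format."""
    return [
        {"field": group["field"], "value": v["label"], "count": v["count"]}
        for group in to_grouped(response)
        for v in group["values"]
    ]


# Hand-written sample response (illustrative only, not real output).
sample_response = {
    "aggregations": {
        "fields": {
            "doc_count": 39,
            "fields": {
                "buckets": [
                    {
                        "key": "5535627831efa0843554b0f3",
                        "doc_count": 32,
                        "values": {
                            "buckets": [
                                {"key": "cott", "doc_count": 20},
                                {"key": "polye", "doc_count": 12},
                            ]
                        },
                    },
                    {
                        "key": "5535627831efa0843554b0f4",
                        "doc_count": 7,
                        "values": {
                            "buckets": [
                                {"key": "18", "doc_count": 5},
                                {"key": "24", "doc_count": 2},
                            ]
                        },
                    },
                ]
            },
        }
    }
}

if __name__ == "__main__":
    for row in to_flat(sample_response):
        print(row)
```

Two things to keep in mind: inside a nested aggregation, doc_count counts the nested ExtraFields objects rather than the parent documents, and a terms aggregation returns only the top 10 buckets by default, so set size on both terms aggregations if you have more distinct fields or values than that.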

Give it a try and let us know.