Terms Aggregation based on Distinct Terms per Collection

I have documents like this:

{
  "foo": null,
  "bars": [
    {
      "baz": "BAZ",
      "qux": null,
      "bears": [
        {
          "fruit": "banana"
        }
      ]
    }
  ]
}

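I want buckets for the fruit terms, where each bucket's doc_count is the number of bars that contain at least one bear with that fruit. For example, given the following documents: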

{
  "foo": null,
  "bars": [
    {
      "baz": "BAZ",
      "qux": null,
      "bears": [
        {
          "fruit": "banana"
        },
        {
          "fruit": "banana"
        },
        {
          "fruit": "apple"
        }
      ]
    },
    {
      "baz": "BAZ",
      "qux": null,
      "bears": [
        {
          "fruit": "banana"
        }
      ]
    }
  ]
}
{
  "foo": null,
  "bars": [
    {
      "baz": "BAZ",
      "qux": null,
      "bears": [
        {
          "fruit": "apple"
        },
        {
          "fruit": "apple"
        },
        {
          "fruit": "orange"
        }
      ]
    }
  ]
}

I would like results like this:

"buckets": [
  {
    "key": "banana",
    "doc_count": 2
  },
  {
    "key": "apple",
    "doc_count": 2
  },
  {
    "key": "orange",
    "doc_count": 1
  }
]

That is, banana appears as a descendant of 2 distinct bars, apple appears as a descendant of 2 distinct bars, and orange appears as a descendant of 1 distinct bar.
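To make the expected numbers concrete, here is a small plain-Python sketch (not Elasticsearch code) that computes both counts for the two example documents. The compact list-of-lists representation and the variable names are assumptions made for illustration only.

```python
from collections import Counter

# Each document is a list of bars; each bar is the list of its bears' fruits.
documents = [
    [["banana", "banana", "apple"], ["banana"]],
    [["apple", "apple", "orange"]],
]

occurrences = Counter()    # one count per bear carrying the fruit
distinct_bars = Counter()  # one count per bar that contains the fruit

for bars in documents:
    for fruits in bars:
        occurrences.update(fruits)         # duplicates within a bar are counted
        distinct_bars.update(set(fruits))  # duplicates within a bar collapse

print(dict(occurrences))
print(dict(distinct_bars))
```

The second counter is the one I am after: a bar contributes at most 1 to each fruit, no matter how many of its bears carry that fruit.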

Right now I have the following aggregation, which counts the total number of fruit occurrences:

{
  "aggs": {
    "global": {
      "global": {},
      "aggs": {
        "bars": {
          "nested": {
            "path": "bars"
          },
          "aggs": {
            "bears": {
              "nested": {
                "path": "bars.bears"
              },
              "aggs": {
                "fruits": {
                  "terms": {
                    "field": "bars.bears.fruit"
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

The result is as follows:

"buckets": [
  {
    "key": "banana",
    "doc_count": 3
  },
  {
    "key": "apple",
    "doc_count": 3
  },
  {
    "key": "orange",
    "doc_count": 1
  }
]

This isn't what I'm looking for. Is it possible to modify this query so that it counts the distinct bars containing each fruit?

Adding a working example with index data (the same as shown in the question), mapping, search query, and search result.

Index mapping:

{
  "mappings": {
    "properties": {
      "bars": {
        "type": "nested",
        "properties": {
          "bears": {
            "properties": {
              "fruit": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  }
}

Search query:

{
  "size": 0,
  "aggs": {
    "bars": {
      "nested": {
        "path": "bars"
      },
      "aggs": {
        "fruits": {
          "terms": {
            "field": "bars.bears.fruit"
          }
        }
      }
    }
  }
}

Search result:

"aggregations": {
    "bars": {
      "doc_count": 3,
      "fruits": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "apple",
            "doc_count": 2
          },
          {
            "key": "banana",
            "doc_count": 2
          },
          {
            "key": "orange",
            "doc_count": 1
          }
        ]
      }
    }
  }
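A rough way to see why this mapping gives distinct-bar counts: with bars mapped as nested and bears left as a plain object, each bar is indexed as its own hidden document, and the fruits of its bears are flattened into one multi-valued bars.bears.fruit field. A terms aggregation counts documents per term, so a bar is counted once per fruit. The sketch below models that in plain Python; the helper names flatten_bars and terms_doc_counts are hypothetical, and this is a simplified model, not how Elasticsearch is implemented.

```python
from collections import Counter

# The two documents from the question, reduced to the fields that matter.
docs = [
    {"bars": [
        {"bears": [{"fruit": "banana"}, {"fruit": "banana"}, {"fruit": "apple"}]},
        {"bears": [{"fruit": "banana"}]},
    ]},
    {"bars": [
        {"bears": [{"fruit": "apple"}, {"fruit": "apple"}, {"fruit": "orange"}]},
    ]},
]

def flatten_bars(documents):
    """Model each nested bar as one hidden document with a multi-valued field."""
    hidden = []
    for doc in documents:
        for bar in doc["bars"]:
            fruits = [bear["fruit"] for bear in bar["bears"]]
            hidden.append({"bars.bears.fruit": fruits})
    return hidden

def terms_doc_counts(hidden_docs, field):
    """Count how many hidden documents contain each term at least once."""
    counts = Counter()
    for hidden_doc in hidden_docs:
        counts.update(set(hidden_doc[field]))
    return dict(counts)

hidden_bars = flatten_bars(docs)
fruit_counts = terms_doc_counts(hidden_bars, "bars.bears.fruit")
print(len(hidden_bars), fruit_counts)
```

Under this model there are 3 hidden bar documents, matching the doc_count of 3 on the bars aggregation above. The trade-off is that bears can no longer be queried as independent nested objects.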

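I actually managed to get the result I was looking for, albeit in a slightly different shape, by adding a reverse_nested sub-aggregation under the terms aggregation: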

Query

{
  "aggs": {
    "global": {
      "global": {},
      "aggs": {
        "bars": {
          "nested": {
            "path": "bars"
          },
          "aggs": {
            "bears": {
              "nested": {
                "path": "bars.bears"
              },
              "aggs": {
                "fruits": {
                  "terms": {
                    "field": "bars.bears.fruit"
                  },
                  "aggs": {
                    "fruit_to_bears": {
                      "reverse_nested": {
                        "path": "bars"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Result

"buckets": [
  {
    "key": "banana",
    "doc_count": 3,
    "fruit_to_bears": {
      "doc_count": 2
    }
  },
  {
    "key": "apple",
    "doc_count": 3,
    "fruit_to_bears": {
      "doc_count": 2
    }
  },
  {
    "key": "orange",
    "doc_count": 1,
    "fruit_to_bears": {
      "doc_count": 1
    }
  }
]
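If the exact bucket shape from the question is needed, the response can be reshaped on the client. The following is a minimal sketch in Python, assuming the buckets list has already been extracted from the search response; the function name to_distinct_bar_buckets is hypothetical.

```python
def to_distinct_bar_buckets(buckets, sub_agg="fruit_to_bears"):
    """Replace each bucket's doc_count with the sub-aggregation's doc_count."""
    return [
        {"key": bucket["key"], "doc_count": bucket[sub_agg]["doc_count"]}
        for bucket in buckets
    ]

# Buckets as returned by the query above.
buckets = [
    {"key": "banana", "doc_count": 3, "fruit_to_bears": {"doc_count": 2}},
    {"key": "apple", "doc_count": 3, "fruit_to_bears": {"doc_count": 2}},
    {"key": "orange", "doc_count": 1, "fruit_to_bears": {"doc_count": 1}},
]

reshaped = to_distinct_bar_buckets(buckets)
print(reshaped)
```

Note that the buckets stay ordered by the original occurrence count, not by the distinct-bar count, so a client-side sort may be needed if the order matters.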