Elasticsearch: Aggregate known object keys (not values)
My Elasticsearch instance has an index containing documents like the following:
[{
"_index": "products",
"_type": "product",
"_id": "100",
"_score": 1,
"_source": {
"id": "100",
"name": "Product 1",
"catalogue": {
"categories": {
"cat1": ["h1", "spin2"],
"cat5": ["h2", "spin2"]
}
}
}
},
{
"_index": "products",
"_type": "product",
"_id": "100",
"_score": 1,
"_source": {
"id": "100",
"name": "Product 1",
"catalogue": {
"categories": {
"cat2": ["d1", "spin2"],
"cat5": ["h2", "spin2"]
}
}
}
}]
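I need to aggregate on the known category keys (cat1, cat2, cat5), not on their values. The expected result for the documents above is: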
"aggregations": {
"categories": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "cat1",
"doc_count": 1
},
{
"key": "cat2",
"doc_count": 1
},
{
"key": "cat5",
"doc_count": 2
}
]
}
}
How should I define the search call?
GET _search
{
"aggregations": {
"categories": {
"terms": {
???
}
}
}
}
Update:
Should I use the script key, as shown below? That would probably have a performance impact, right?
GET _search
{
"aggregations": {
"categories": {
"terms": {
"script" : "????"
}
}
}
}
You can do it like this:
GET /products/product/_search?search_type=count
{
"aggs": {
"cats": {
"terms": {
"script": "categories=_source.catalogue.categories;terms=[];for(categ in categories.keySet())terms+=categ;return terms"
}
}
}
}
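But yes, it will have a performance impact, because the script reads _source for every matching document. You need to test this and see how it behaves. Make sure to run the same query several times, since the first run may take longer, which is normal.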
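To make it clearer what the script asks Elasticsearch to do, here is a minimal Python sketch that emulates the same logic in memory: for each document it emits the key names under catalogue.categories, then counts how many documents contain each key. It does not talk to Elasticsearch; the helper names category_keys and terms_buckets are hypothetical and exist only for this illustration. The buckets are sorted by key here to match the expected output in the question, which is an assumption about ordering rather than a statement about how the real aggregation orders its buckets.

```python
from collections import Counter

# The _source bodies of the two sample documents from the question.
docs = [
    {"id": "100", "name": "Product 1",
     "catalogue": {"categories": {"cat1": ["h1", "spin2"],
                                  "cat5": ["h2", "spin2"]}}},
    {"id": "100", "name": "Product 1",
     "catalogue": {"categories": {"cat2": ["d1", "spin2"],
                                  "cat5": ["h2", "spin2"]}}},
]


def category_keys(source):
    """Mirror the script: return the key names under catalogue.categories."""
    return list(source.get("catalogue", {}).get("categories", {}).keys())


def terms_buckets(sources):
    """Count, for each key, the number of documents that contain it."""
    counts = Counter()
    for source in sources:
        # A set ensures each document contributes at most once per key.
        counts.update(set(category_keys(source)))
    # Sorted by key only to match the layout of the expected result above.
    return [{"key": key, "doc_count": n} for key, n in sorted(counts.items())]


buckets = terms_buckets(docs)
print(buckets)
```

The per-document work in category_keys is the part the script repeats for every hit, which is where the performance cost mentioned above comes from.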