Aggregation (many values in one field) in Elasticsearch
I have several values in one field, and when I run an aggregation I get these values back as separate values.
Example:
name : jess , Region : new york
name : jess , Region : poland
Request:
query = {
    "size": total,
    "aggs": {
        "buckets_for_name": {
            "terms": {
                "field": "name",
                "size": total
            },
            "aggs": {
                "region_terms": {
                    "terms": {
                        "field": "region",
                        "size": total
                    }
                }
            }
        }
    }
}
and from response["aggregations"]["buckets_for_name"]["buckets"]
I get:
{'key': 'jess ', 'doc_count': 61, 'region_terms': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0, 'buckets': [{'key': 'oran', 'doc_count': 60}, {'key': 'new ', 'doc_count': 1}, {'key': 'york', 'doc_count': 1}]}}, {'key': 'jess ', 'doc_count': 50, 'region_terms': {'doc_count_error_upper_bound': 0, 'sum_other_doc_count': 0, 'buckets': [{'key': 'poland', 'doc_count': 50}]}}
and with
pretty_results = []
for result in response["aggregations"]["buckets_for_name"]["buckets"]:
    d = dict()
    d["name"] = result["key"]
    d["region"] = []
    # Collect every region key found under this name bucket.
    for region in result["region_terms"]["buckets"]:
        d["region"].append(region["key"])
    pretty_results.append(d)
    print(d)
I get:
{'name': 'jess ', 'region': ['new', 'york', 'poland']}
I would like to get this result:
{'name': 'jess ', 'region': ['new york', 'poland']}
The region (and, I assume, name) field is analyzed with the standard analyzer, which splits new york into the tokens [new, york].
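To make the difference concrete, here is a rough sketch in plain Python. It does not call Elasticsearch: approximate_standard_analyzer is a hypothetical helper that only imitates the lowercase-and-split behaviour on simple ASCII input (the real standard analyzer follows Unicode text segmentation rules), and keyword_tokens is a hypothetical stand-in for how a keyword field keeps the value whole.

```python
import re


def approximate_standard_analyzer(text):
    """Hypothetical approximation: lowercase, then split on non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def keyword_tokens(text):
    """Hypothetical stand-in for a keyword field: the whole string is one term."""
    return [text]


print(approximate_standard_analyzer("new york"))  # ['new', 'york']
print(keyword_tokens("new york"))                 # ['new york']
```

A terms aggregation buckets by indexed term, which is why the analyzed field produces one bucket per word.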
What you probably want is a keyword mapping, which treats each string as a single standalone token:
PUT regions
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fielddata": true,
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "region": {
        "type": "text",
        "fielddata": true,
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
After that, run the aggregation on the .keyword fields:
{
  "size": 200,
  "aggs": {
    "buckets_for_name": {
      "terms": {
        "field": "name.keyword",      <---
        "size": 200
      },
      "aggs": {
        "region_terms": {
          "terms": {
            "field": "region.keyword",      <---
            "size": 200
          }
        }
      }
    }
  }
}
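With the aggregation on the .keyword fields, the flattening loop from the question should produce one entry per whole region. Below is a minimal sketch of that step. The helper name flatten_name_regions and the sample_response dictionary are made up for illustration (it is not real cluster output); only the bucket layout follows the response shown in the question.

```python
def flatten_name_regions(response):
    """Turn nested name -> region buckets into a list of flat dicts."""
    results = []
    for name_bucket in response["aggregations"]["buckets_for_name"]["buckets"]:
        # Each sub-bucket key is now a whole region string such as "new york".
        regions = [b["key"] for b in name_bucket["region_terms"]["buckets"]]
        results.append({"name": name_bucket["key"], "region": regions})
    return results


# Hypothetical response, hand-written to mirror the structure in the question.
sample_response = {
    "aggregations": {
        "buckets_for_name": {
            "buckets": [
                {
                    "key": "jess",
                    "doc_count": 51,
                    "region_terms": {
                        "buckets": [
                            {"key": "poland", "doc_count": 50},
                            {"key": "new york", "doc_count": 1},
                        ]
                    },
                }
            ]
        }
    }
}

print(flatten_name_regions(sample_response))
# [{'name': 'jess', 'region': ['poland', 'new york']}]
```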
If you would rather keep the value space-less, as newyork, take a look at the filters available in analyzers.
Edit based on the comments
Aggs are not part of the query; they have their own scope. So change this
{
  "query": {
    "aggs": {
      "buckets_for_name": {
to this
{
  "query": {
    // possibly leave the whole query attribute out
  },
  "aggs": {
    "buckets_for_name": {
      ...
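In Python, the corrected layout could be built along these lines. This is only a sketch: build_search_body is a hypothetical helper, and the field names assume the .keyword sub-fields from the mapping above. The point is that "aggs" sits next to "query" at the top level, and "query" is left out entirely when no filter is needed.

```python
def build_search_body(size, query=None):
    """Build a search body with "aggs" as a sibling of "query", not nested in it."""
    body = {
        "size": size,
        "aggs": {
            "buckets_for_name": {
                "terms": {"field": "name.keyword", "size": size},
                "aggs": {
                    "region_terms": {
                        "terms": {"field": "region.keyword", "size": size}
                    }
                },
            }
        },
    }
    # Only add "query" when the caller supplies one.
    if query is not None:
        body["query"] = query
    return body


body = build_search_body(200)
filtered = build_search_body(200, query={"match_all": {}})
print(sorted(body))      # ['aggs', 'size']
print(sorted(filtered))  # ['aggs', 'query', 'size']
```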