How to get the individual count of a field from Elasticsearch
My dictionary looks like this:
test = [ { 'masterid': '1', 'name': 'Group1', 'BusinessArea': [ { 'id': '14', 'name': 'Accounting', 'parentname': 'Finance'}, { 'id': '3', 'name': 'Research', 'parentname': 'R & D' } ], 'Designation': [ { 'id': '16', 'name': 'L1' }, { 'id': '20', 'name': 'L2' }, { 'id': '25', 'name': 'L2' }] },
{ 'masterid': '2', 'name': 'Group1', 'BusinessArea': [ { 'id': '14', 'name': 'Research', 'parentname': '' }, { 'id': '3', 'name': 'Accounting', 'parentname': '' } ], 'Role': [ { 'id': '5032', 'name': 'Tester' }, { 'id': '5033', 'name': 'Developer' } ], 'Designation': [ { 'id': '16', 'name': 'L1' }, { 'id': '20', 'name': 'L2' }, { 'id': '25', 'name': 'L2' }]},
{ 'masterid': '3', 'name': 'Group1', 'BusinessArea': [ { 'id': '14', 'name': 'Engineering' }, { 'id': '3', 'name': 'Engineering', 'parentname': '' } ], 'Role': [ { 'id': '5032', 'name': 'Developer' }, { 'id': '5033', 'name': 'Developer', 'parentname': '' } ], 'Designation': [ { 'id': '16', 'name': 'L1' }, { 'id': '20', 'name': 'L2' }, { 'id': '25', 'name': 'L2' }]}]
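The code below puts it into an Elasticsearch index:
from elasticsearch import Elasticsearch
es = Elasticsearch()
es.indices.create(index='new')
for e in test:
    # The documents have no top-level 'id' key, so 'masterid' is used as the document id.
    es.index(index="new", body=e, id=e['masterid'])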
I want to get the count of masterid for each BusinessArea name. The names here are Accounting, Research and Engineering:
[ {
"name": "BusinessArea",
"values": [
{
"name": "Accounting",
"count": "2"
},
{
"name": "Research",
"count": "2"
},
{
"name": "Engineering",
"count": "1"
}]
}]
Or can I get an answer like the following?
{
"A": {
"Designation": [{
"key": "L1",
"doc_count": 3
},
{
"key": "L2",
"doc_count": 3
}
]
},
"B": {
"BusinessArea": [{
"key": "Accounting",
"doc_count": 2
},
{
"key": "Research",
"doc_count": 2
},
{
"key": "Engineering",
"doc_count": 1
}
]
}
}
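You can simply use the count API of Elasticsearch to get the count of all documents in an index, or of the documents matching a condition, as shown in the same documentation.
For your case, it would be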
GET /<your-index-name>/_count?q=name:BusinessArea
Or, if masterid is the unique id of the documents, you can simply use
GET /<your-index-name>/_count
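From Python, the same two requests can be sent through the client used in the question. The helper below is only a minimal sketch and not part of the original answer: `count_documents` is a hypothetical name, and it assumes a client whose `count` method accepts `index` and `body` keyword arguments (as the `elasticsearch` package does in the versions that also accept `body` in `es.index`) and returns a dictionary with a `count` key.

```python
def count_documents(client, index, query=None):
    """Return how many documents in `index` match `query` (all documents if None)."""
    kwargs = {"index": index}
    if query is not None:
        # Assumption: the client takes the request under `body`,
        # like the es.index(..., body=...) call in the question.
        kwargs["body"] = {"query": query}
    response = client.count(**kwargs)
    return response["count"]
```

With the `es` object from the question, `count_documents(es, "new")` would correspond to `GET /new/_count`, and passing something like `query={"match": {"name": "Group1"}}` would restrict the count to matching documents.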
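If you want the individual count of each value of a field, you can use a terms aggregation, which is a multi-bucket, value-source-based aggregation where buckets are built dynamically, one per unique value.
Search query: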
{
"size":0,
"aggs": {
"countNames": {
"terms": {
"field": "BusinessArea.name.keyword"
}
}
}
}
Search result:
"aggregations": {
"countNames": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Accounting",
"doc_count": 2
},
{
"key": "Research",
"doc_count": 2
},
{
"key": "Engineering",
"doc_count": 1
}
]
}
}
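Note that Engineering gets a doc_count of 1 even though it appears twice inside the document with masterid 3: a terms aggregation counts documents, not occurrences. The sketch below reproduces that counting locally on a trimmed copy of the question's data (no cluster needed) and then reshapes the buckets into the name/values/count layout asked for in the question. The function names `terms_doc_counts` and `to_requested_format` are hypothetical, and the ordering rule (highest count first, ties by key) is an assumption chosen to match the result above.

```python
from collections import Counter

# Trimmed copy of the question's documents: only the fields being counted.
docs = [
    {"masterid": "1", "BusinessArea": [{"name": "Accounting"}, {"name": "Research"}]},
    {"masterid": "2", "BusinessArea": [{"name": "Research"}, {"name": "Accounting"}]},
    {"masterid": "3", "BusinessArea": [{"name": "Engineering"}, {"name": "Engineering"}]},
]


def terms_doc_counts(documents, field):
    """Count, per distinct name under `field`, how many documents contain it."""
    counts = Counter()
    for doc in documents:
        # A set, so a name repeated inside one document is counted once.
        counts.update({item["name"] for item in doc.get(field, [])})
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]


def to_requested_format(field, buckets):
    """Reshape terms buckets into the layout from the question."""
    values = [{"name": b["key"], "count": str(b["doc_count"])} for b in buckets]
    return [{"name": field, "values": values}]


buckets = terms_doc_counts(docs, "BusinessArea")
print(to_requested_format("BusinessArea", buckets))
```

Against a real index, the `buckets` list would instead come from the `aggregations.countNames.buckets` part of the search response, and `to_requested_format` could be applied to it unchanged. The counts are converted to strings only because the question's example shows them quoted.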
Update 1:
If you want separate counts for the Designation and BusinessArea fields:
Search query:
{
"size": 0,
"aggs": {
"countNames": {
"terms": {
"field": "BusinessArea.name.keyword"
}
},
"designationNames": {
"terms": {
"field": "Designation.name.keyword"
}
}
}
}
Search result:
"aggregations": {
"designationNames": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "L1",
"doc_count": 3
},
{
"key": "L2",
"doc_count": 3
}
]
},
"countNames": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Accounting",
"doc_count": 2
},
{
"key": "Research",
"doc_count": 2
},
{
"key": "Engineering",
"doc_count": 1
}
]
}
}
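When more fields need counting, the request body can be generated instead of written by hand. The following is a rough sketch under stated assumptions: `build_terms_aggs` and `buckets_by_label` are hypothetical helper names, the `.keyword` sub-fields are assumed to exist (as they do under the default dynamic mapping for strings), and `size` is written out explicitly because a terms aggregation returns only the top 10 buckets by default.

```python
def build_terms_aggs(agg_fields, size=10):
    """Build a search body with one terms aggregation per (agg name, field) pair."""
    return {
        "size": 0,  # return no hits, only the aggregation results
        "aggs": {
            agg_name: {"terms": {"field": field, "size": size}}
            for agg_name, field in agg_fields.items()
        },
    }


def buckets_by_label(aggregations, labels):
    """Map each output label to the bucket list of its aggregation."""
    return {label: aggregations[agg_name]["buckets"]
            for agg_name, label in labels.items()}


body = build_terms_aggs({
    "countNames": "BusinessArea.name.keyword",
    "designationNames": "Designation.name.keyword",
})

# The "aggregations" part of the response shown above, without the error fields.
sample_aggregations = {
    "designationNames": {"buckets": [
        {"key": "L1", "doc_count": 3},
        {"key": "L2", "doc_count": 3},
    ]},
    "countNames": {"buckets": [
        {"key": "Accounting", "doc_count": 2},
        {"key": "Research", "doc_count": 2},
        {"key": "Engineering", "doc_count": 1},
    ]},
}

result = buckets_by_label(
    sample_aggregations,
    {"designationNames": "Designation", "countNames": "BusinessArea"},
)
print(result)
```

With the client from the question, `body` would be passed as `es.search(index="new", body=body)` and the `"aggregations"` key of the response used in place of `sample_aggregations`. The resulting dictionary, keyed by Designation and BusinessArea, is close to the second layout asked for in the question.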