How to group distinct records in Elasticsearch
I have the following data in my Elasticsearch index:
{
"title": "Hello from elastic",
"name": "ABC",
"j_id": "1",
"date": "2021-03-02T12:29:31.356514"
},
{
"title": "Hello from elastic",
"name": "PQR",
"j_id": "1",
"date": "2021-03-02T12:29:31.356514"
},
{
"title": "Hello from elastic",
"name": "XYZ",
"j_id": "2",
"date": "2021-03-02T12:29:31.356514"
},
{
"title": "Hello from elastic",
"name": "MNO",
"j_id": "3",
"date": "2021-03-02T12:29:31.356514"
}
Now I want to get the unique records on the basis of the id (the j_id field).
Expected output:
{
"1": [{
"title": "Hello from elastic",
"name": "ABC",
"j_id": "1",
"date": "2021-03-02T12:29:31.356514"
},
{
"title": "Hello from elastic",
"name": "PQR",
"j_id": "1",
"date": "2021-03-02T12:29:31.356514"
}],
"2": [{
"title": "Hello from elastic",
"name": "XYZ",
"j_id": "2",
"date": "2021-03-02T12:29:31.356514"
}],
"3": [{
"title": "Hello from elastic",
"name": "MNO",
"j_id": "3",
"date": "2021-03-02T12:29:31.356514"
}]
}
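I tried an aggregation query, but it only returns counts.
I also want the latest records included in the response.
- How do I get unique records grouped by id from Elasticsearch?
- I want the most recently inserted data to come first.
Assuming a minimal mapping covering the date and j_id fields: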
PUT myindex
{
"mappings": {
"properties": {
"j_id": {
"type": "keyword"
},
"date": {
"type": "date"
}
}
}
}
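You can use a terms aggregation whose sub-aggregation is an ordered top_hits aggregation. The max_date sub-aggregation orders the buckets so that the group with the newest record comes first, and top_hits returns the documents of each group, newest first: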
POST myindex/_search?filter_path=aggregations.*.buckets.key,aggregations.*.buckets.sorted_hits.hits.hits._source
{
"size": 0,
"aggs": {
"by_j_id": {
"terms": {
"field": "j_id",
"size": 10,
"order": {
"max_date": "desc"
}
},
"aggs": {
"max_date": {
"max": {
"field": "date"
}
},
"sorted_hits": {
"top_hits": {
"size": 10,
"sort": [
{
"date": {
"order": "desc"
}
}
]
}
}
}
}
}
}
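The request body above can also be assembled programmatically. The following is only a sketch in plain Python: the function name build_group_query and its parameters are hypothetical and not part of any Elasticsearch client. It returns a dictionary with the same shape as the query shown, which you could then serialize and send with whatever HTTP or client library you use.

```python
import json


def build_group_query(group_field="j_id", date_field="date",
                      max_groups=10, hits_per_group=10):
    """Build a search body that groups documents by group_field.

    Hypothetical helper: it mirrors the terms + top_hits request
    shown above and does not talk to Elasticsearch itself.
    """
    return {
        # No top-level hits are needed, only the aggregation results.
        "size": 0,
        "aggs": {
            "by_j_id": {
                "terms": {
                    "field": group_field,
                    # Upper bound on the number of groups returned.
                    "size": max_groups,
                    # Groups with the newest document come first.
                    "order": {"max_date": "desc"},
                },
                "aggs": {
                    # Newest date per group, used only for bucket ordering.
                    "max_date": {"max": {"field": date_field}},
                    # The documents of each group, newest first.
                    "sorted_hits": {
                        "top_hits": {
                            "size": hits_per_group,
                            "sort": [{date_field: {"order": "desc"}}],
                        }
                    },
                },
            }
        },
    }


body = build_group_query()
print(json.dumps(body, indent=2))
```

Note that both size values are caps: groups beyond max_groups and documents beyond hits_per_group in a group are not returned, so raise them if your data needs it.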
The URL parameter filter_path reduces the response body so that it closely mimics your required format:
{
"aggregations" : {
"by_j_id" : {
"buckets" : [
{
"key" : "1",
"sorted_hits" : {
"hits" : {
"hits" : [
{
"_source" : {
"title" : "Hello from elastic",
"name" : "ABC",
"j_id" : "1",
"date" : "2021-03-02T12:29:31.356514"
}
},
{
"_source" : {
"title" : "Hello from elastic",
"name" : "PQR",
"j_id" : "1",
"date" : "2021-03-02T12:29:31.356514"
}
}
]
}
}
},
{
"key" : "2",
"sorted_hits" : {
"hits" : {
"hits" : [
{
"_source" : {
"title" : "Hello from elastic",
"name" : "XYZ",
"j_id" : "2",
"date" : "2021-03-02T12:29:31.356514"
}
}
]
}
}
},
{
"key" : "3",
"sorted_hits" : {
"hits" : {
"hits" : [
{
"_source" : {
"title" : "Hello from elastic",
"name" : "MNO",
"j_id" : "3",
"date" : "2021-03-02T12:29:31.356514"
}
}
]
}
}
}
]
}
}
}
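The filtered response is still nested under aggregations and buckets, so getting exactly the expected output (a map from id to a list of records) takes one small client-side step. Below is a minimal sketch in plain Python; the function name group_sources is hypothetical, and it assumes the response has already been parsed into a dictionary and uses the aggregation names by_j_id and sorted_hits from the query above.

```python
def group_sources(response, agg_name="by_j_id", hits_name="sorted_hits"):
    """Turn the aggregation response into {key: [source, ...]}.

    Hypothetical helper: it only reshapes an already-parsed response
    and tolerates missing levels by treating them as empty.
    """
    buckets = response.get("aggregations", {}).get(agg_name, {}).get("buckets", [])
    grouped = {}
    for bucket in buckets:
        hits = bucket.get(hits_name, {}).get("hits", {}).get("hits", [])
        # Keep only the original documents, in the order returned.
        grouped[bucket["key"]] = [hit["_source"] for hit in hits]
    return grouped


# A shortened copy of the response shown above, used as sample input.
sample_response = {
    "aggregations": {
        "by_j_id": {
            "buckets": [
                {
                    "key": "1",
                    "sorted_hits": {"hits": {"hits": [
                        {"_source": {"name": "ABC", "j_id": "1"}},
                        {"_source": {"name": "PQR", "j_id": "1"}},
                    ]}},
                },
                {
                    "key": "2",
                    "sorted_hits": {"hits": {"hits": [
                        {"_source": {"name": "XYZ", "j_id": "2"}},
                    ]}},
                },
            ]
        }
    }
}

grouped = group_sources(sample_response)
print(grouped)
```

Because Python dictionaries preserve insertion order, the groups stay in the order the buckets were returned, that is, newest group first.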