How to derive a field from two fields in an Elasticsearch index?
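I have an index with the following fields:
- room_name
- start_date (the hour at which the room starts being occupied)
- end_date (the hour at which the room stops being occupied)
I am writing a curl command that returns the hours during which each room is in use.
Is this possible?
Here is the current curl command: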
curl -XGET "https://localhost:9200/testindex/_search?pretty" -H 'Content-Type: application/json' -d'
{
"aggs": {
"room_bucket":{
"terms": {
"field": "room_name.keyword"
},
"aggs":{
"hour_bucket": {
"terms": {
"script": {
"inline": "def l = doc[\"start_date\"].value;\nif ( l <= 20 && l >= 9 ) {\n return l;\n}",
"lang": "painless"
},
"order": {
"_key": "asc"
},
"value_type": "long"
}
}
}
}
}
}'
The result is as follows:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [
{
"_index" : "testindex",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"log_version" : 1,
"start_date" : 10,
"end_date" : 11,
"room_name" : "room_Y"
}
},
{
"_index" : "testindex",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"log_version" : 1,
"start_date" : 11,
"end_date" : 13,
"room_name" : "room_V"
}
},
{
"_index" : "testindex",
"_type" : "_doc",
"_id" : "3",
"_score" : 1.0,
"_source" : {
"log_version" : 1,
"start_date" : 10,
"end_date" : 12,
"room_name" : "room_Y"
}
}
]
},
"aggregations" : {
"room_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "room_V",
"doc_count" : 1,
"hour_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 11,
"doc_count" : 1
}
]
}
},
{
"key" : "room_Y",
"doc_count" : 1,
"hour_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 10,
"doc_count" : 1
}
]
}
}
]
}
}
}
But my expected result in "aggregations" is as follows:
"aggregations" : {
"room_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "room_V",
"doc_count" : 1,
"hour_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 11,
"doc_count" : 1
},
{
"key" : 12,
"doc_count" : 1
},
{
"key" : 13,
"doc_count" : 1
}
]
}
},
{
"key" : "room_Y",
"doc_count" : 1,
"hour_bucket" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 10,
"doc_count" : 2
},
{
"key" : 11,
"doc_count" : 2
},
{
"key" : 12,
"doc_count" : 1
}
]
}
}
]
}
}
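The current result only reads start_date.
However, in the expected output, room_V should have "key" = 11, "key" = 12, and "key" = 13 (with a doc_count of 1 for each key), because according to start_date and end_date the room is in use from 11 to 13.
You can achieve what you want by using LongStream to build an array of all the hours in the interval, like this: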
curl -XGET "https://localhost:9200/testindex/_search?pretty" -H 'Content-Type: application/json' -d'
{
"aggs": {
"room_bucket": {
"terms": {
"field": "room_name.keyword"
},
"aggs": {
"hour_bucket": {
"terms": {
"script": {
"inline": "return LongStream.rangeClosed(doc.start_date.value, doc.end_date.value).toArray();",
"lang": "painless"
},
"order": {
"_key": "asc"
},
"value_type": "long"
}
}
}
}
}
}'
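To see why this yields the expected buckets, here is a minimal Python sketch that mimics the aggregation on the three sample documents from the question. It is only an illustration, not how Elasticsearch runs the query: the helper names hours_in_use and hour_buckets are hypothetical, and it assumes that start_date and end_date are whole hours and that each document counts once per hour it covers.

```python
from collections import Counter, defaultdict

# The three sample documents shown in the search hits of the question.
docs = [
    {"room_name": "room_Y", "start_date": 10, "end_date": 11},
    {"room_name": "room_V", "start_date": 11, "end_date": 13},
    {"room_name": "room_Y", "start_date": 10, "end_date": 12},
]


def hours_in_use(doc):
    """Mirror LongStream.rangeClosed(start, end): both endpoints are included."""
    return list(range(doc["start_date"], doc["end_date"] + 1))


def hour_buckets(docs):
    """Group by room, then count how many documents cover each hour."""
    buckets = defaultdict(Counter)
    for doc in docs:
        # Assumption: a document contributes at most once to each hour bucket.
        buckets[doc["room_name"]].update(set(hours_in_use(doc)))
    # Sort hours ascending, like "order": {"_key": "asc"} in the query.
    return {room: dict(sorted(counts.items())) for room, counts in buckets.items()}


result = hour_buckets(docs)
for room, counts in sorted(result.items()):
    print(room, counts)
```

Because the range is closed on both ends, a booking from 11 to 13 falls into the buckets 11, 12, and 13. If end_date should be treated as exclusive, LongStream.range would be the variant to try instead of rangeClosed.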