How to find records matching the result of a previous search using ElasticSearch Painless scripting
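I have attached the index below.
Each document in the index contains the name and height of either Alice or Bob, along with the age at which the height was measured. The measurement taken at age 10 is flagged with "baseline_height_at_age_10": true
I need to do the following:
- Find the heights of Alice and Bob at age 10.
- Return the records for Alice and Bob whose height is lower than their height at age 10.
So my question is: can Painless do this kind of search?
I would appreciate it if you could point me to a good example.
Also: is ElasticSearch Painless a good approach to this problem? Can you suggest
Index mapping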
PUT /test/
{
  "mappings": {
    "_doc": {
      "properties": {
        "first_name": {
          "type": "keyword",
          "fields": {
            "raw": {
              "type": "text"
            }
          }
        },
        "surname": {
          "type": "keyword",
          "fields": {
            "raw": {
              "type": "text"
            }
          }
        },
        "baseline_height_at_age_10": {
          "type": "boolean"
        },
        "age": {
          "type": "integer"
        },
        "height": {
          "type": "integer"
        }
      }
    }
  }
}
Index data
POST /test/_doc/alice_green_8_110
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 8,
  "height": 110,
  "baseline_height_at_age_10": false
}
POST /test/_doc/alice_green_10_120
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 10,
  "height": 120,
  "baseline_height_at_age_10": true
}
POST /test/_doc/alice_green_13_140
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 13,
  "height": 140,
  "baseline_height_at_age_10": false
}
POST /test/_doc/alice_green_23_170
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 23,
  "height": 170,
  "baseline_height_at_age_10": false
}
POST /test/_doc/bob_green_8_120
{
  "first_name": "Bob",
  "surname": "Green",
  "age": 8,
  "height": 120,
  "baseline_height_at_age_10": false
}
POST /test/_doc/bob_green_10_130
{
  "first_name": "Bob",
  "surname": "Green",
  "age": 10,
  "height": 130,
  "baseline_height_at_age_10": true
}
POST /test/_doc/bob_green_15_160
{
  "first_name": "Bob",
  "surname": "Green",
  "age": 15,
  "height": 160,
  "baseline_height_at_age_10": false
}
POST /test/_doc/bob_green_21_180
{
  "first_name": "Bob",
  "surname": "Green",
  "age": 21,
  "height": 180,
  "baseline_height_at_age_10": false
}
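You should be able to do this using aggregations alone. Assuming people only ever get taller and the measurements are accurate, you can restrict the query to documents with an age of 10 or under, find the maximum height among those documents, and then filter those results to exclude the baseline result.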
POST test/_search
{
  "size": 0,
  "query": {
    "range": {
      "age": {
        "lte": 10
      }
    }
  },
  "aggs": {
    "names": {
      "terms": {
        "field": "first_name",
        "size": 10
      },
      "aggs": {
        "max_height": {
          "max": {
            "field": "height"
          }
        },
        "non-baseline": {
          "filter": {
            "match": {
              "baseline_height_at_age_10": false
            }
          },
          "aggs": {
            "top_hits": {
              "top_hits": {
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
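To make it easier to see what this aggregation computes, here is a rough in-memory sketch in plain Python that mimics the same steps on the sample data. It does not talk to Elasticsearch at all; `DOCS` and `baseline_aggregation` are illustrative names, and the sketch assumes Bob's documents carry "first_name": "Bob".

```python
from collections import defaultdict

# Sample measurements. Assumption: Bob's documents use first_name "Bob".
DOCS = [
    {"first_name": "Alice", "age": 8, "height": 110, "baseline_height_at_age_10": False},
    {"first_name": "Alice", "age": 10, "height": 120, "baseline_height_at_age_10": True},
    {"first_name": "Alice", "age": 13, "height": 140, "baseline_height_at_age_10": False},
    {"first_name": "Alice", "age": 23, "height": 170, "baseline_height_at_age_10": False},
    {"first_name": "Bob", "age": 8, "height": 120, "baseline_height_at_age_10": False},
    {"first_name": "Bob", "age": 10, "height": 130, "baseline_height_at_age_10": True},
    {"first_name": "Bob", "age": 15, "height": 160, "baseline_height_at_age_10": False},
    {"first_name": "Bob", "age": 21, "height": 180, "baseline_height_at_age_10": False},
]


def baseline_aggregation(docs):
    """Mimic the request: range query, terms buckets, max, non-baseline filter."""
    buckets = defaultdict(list)
    for doc in docs:
        if doc["age"] <= 10:                        # "range" query on age
            buckets[doc["first_name"]].append(doc)  # "terms" aggregation

    result = {}
    for name, hits in buckets.items():
        result[name] = {
            # "max" aggregation on height
            "max_height": max(d["height"] for d in hits),
            # "filter" aggregation that drops the baseline document
            "non_baseline": [d for d in hits if not d["baseline_height_at_age_10"]],
        }
    return result


summary = baseline_aggregation(DOCS)
```

For the sample data this gives a maximum height of 120 for Alice and 130 for Bob, with the age 8 measurements as the only non-baseline hits. Note that everything above age 10 is discarded by the range query, so the result is only complete if heights really never decrease.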
I posted the same question on the ElasticSearch support forum, with the focus on Painless scripting: How to find records matching the result of a previous search using ElasticSearch Painless scripting
The answer was:
"I don't think the Painless approach will work here. You cannot use
the results of one query to execute a second query with Painless.
The two-step approach that you outline at the end of your post is the
way to go."
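The bottom line is that you cannot use the result of one query as the input to another query. You can filter, aggregate and so on, but not this.
So the approach, according to my understanding, is to do the first search, process the data, and then do an additional search. This basically translates to:
- Search for the record with first_name=Alice and baseline_height_at_age_10=True.
- Process it externally and extract Alice's height at age 10.
- Search for Alice's records whose height is lower than the externally extracted value.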
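The three steps above could be sketched roughly like this in Python. This is only an outline under some assumptions: the helper names are made up for illustration, the request bodies use ordinary `bool`/`term`/`range` queries, and `search` stands for whatever callable you use to send a request body to the index and get the response back as a dict.

```python
def baseline_query(first_name):
    """Step 1: request body that finds the person's baseline document."""
    return {
        "size": 1,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"first_name": first_name}},
                    {"term": {"baseline_height_at_age_10": True}},
                ]
            }
        },
    }


def extract_baseline_height(response):
    """Step 2: read the height from the first hit of a search response."""
    hits = response["hits"]["hits"]
    if not hits:
        raise LookupError("no baseline document found")
    return hits[0]["_source"]["height"]


def shorter_than_query(first_name, baseline_height):
    """Step 3: request body for the person's records below the baseline."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"first_name": first_name}},
                    {"range": {"height": {"lt": baseline_height}}},
                ]
            }
        }
    }


def records_below_baseline(search, first_name):
    """Run both searches; `search` is a hypothetical callable (body -> response dict)."""
    baseline = extract_baseline_height(search(baseline_query(first_name)))
    return search(shorter_than_query(first_name, baseline))
```

With the official Python client, `search` might simply wrap a call such as `es.search(index="test", body=body)` in client versions that accept a `body` argument. Repeating the call for Bob covers the second person; the processing between the two searches always happens outside Elasticsearch.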