How to find records matching the result of a previous search using ElasticSearch Painless scripting

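I have attached the index mapping and sample data below.

Each document in the index contains the name and height of either Alice or Bob, together with the age at which the height was measured. The measurement taken at age 10 is flagged with "baseline_height_at_age_10": true

I need to do the following: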

  1. Find the heights of Alice and Bob at age 10.
  2. Return the records for Alice and Bob in which the height is lower than their own height at age 10.

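So my question is: can Painless do this kind of search? I would appreciate it if you could point me to a good example.

Also: is ElasticSearch Painless a good approach to this problem in the first place? Can you suggest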

Index mapping

PUT /test
{
  "mappings": {
    "_doc": {
      "properties": {
        "first_name": {
          "type": "keyword",
          "fields": {
            "raw": {
              "type": "text"
            }
          }
        },
        "surname": {
          "type": "keyword",
          "fields": {
            "raw": {
              "type": "text"
            }
          }
        },
        "baseline_height_at_age_10": {
          "type": "boolean"
        },
        "age": {
          "type": "integer"
        },
        "height": {
          "type": "integer"
        }
      }
    }
  }
}

Index data

POST /test/_doc/alice_green_8_110
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 8,
  "height": 110,
  "baseline_height_at_age_10": false
}

POST /test/_doc/alice_green_10_120
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 10,
  "height": 120,
  "baseline_height_at_age_10": true
}

POST /test/_doc/alice_green_13_140
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 13,
  "height": 140,
  "baseline_height_at_age_10": false
}

POST /test/_doc/alice_green_23_170
{
  "first_name": "Alice",
  "surname": "Green",
  "age": 23,
  "height": 170,
  "baseline_height_at_age_10": false
}



POST /test/_doc/bob_green_8_120
{
  "first_name": "Bob",
  "surname": "Green",
  "age": 8,
  "height": 120,
  "baseline_height_at_age_10": false
}

POST /test/_doc/bob_green_10_130
{
  "first_name": "Bob",
  "surname": "Green",
  "age": 10,
  "height": 130,
  "baseline_height_at_age_10": true
}

POST /test/_doc/bob_green_15_160
{
  "first_name": "Bob",
  "surname": "Green",
  "age": 15,
  "height": 160,
  "baseline_height_at_age_10": false
}

POST /test/_doc/bob_green_21_180
{
  "first_name": "Bob",
  "surname": "Green",
  "age": 21,
  "height": 180,
  "baseline_height_at_age_10": false
}

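You should be able to do this with aggregations alone. Assuming that people only ever get taller and that the measurements are accurate, you can restrict the query to documents with an age of 10 or below, find the maximum height among those documents (which is then the baseline height), and filter those same results to exclude the baseline document itself: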

POST test/_search
{
  "size": 0,
  "query": {
    "range": {
      "age": {
        "lte": 10
      }
    }
  },
  "aggs": {
    "names": {
      "terms": {
        "field": "first_name",
        "size": 10
      },
      "aggs": {
        "max_height": {
          "max": {
            "field": "height"
          }
        },
        "non-baseline": {
          "filter": {
            "match": {
              "baseline_height_at_age_10": false
            }
          },
          "aggs": {
            "top_hits": {
              "top_hits": {
                "size": 10
              }
            }
          }
        }
      }
    }
  }
}
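
To read the result of a request like the one above on the client side, something along these lines might work. It is only a sketch: `summarize_buckets` is a hypothetical helper name, and `sample_response` is a hand-written, trimmed dictionary that mimics the shape of an aggregation response for Alice, not real output from a cluster.

```python
def summarize_buckets(response):
    """Map each first_name to its max height and its non-baseline records."""
    summary = {}
    for bucket in response["aggregations"]["names"]["buckets"]:
        # Hits collected by the "top_hits" sub-aggregation under "non-baseline".
        hits = bucket["non-baseline"]["top_hits"]["hits"]["hits"]
        summary[bucket["key"]] = {
            # Max height among documents with age <= 10; this equals the
            # baseline only under the "people only get taller" assumption.
            "max_height": bucket["max_height"]["value"],
            "records": [hit["_source"] for hit in hits],
        }
    return summary


# Hand-written illustration of the response shape (an assumption, trimmed).
sample_response = {
    "aggregations": {
        "names": {
            "buckets": [
                {
                    "key": "Alice",
                    "doc_count": 2,
                    "max_height": {"value": 120.0},
                    "non-baseline": {
                        "doc_count": 1,
                        "top_hits": {
                            "hits": {
                                "hits": [
                                    {
                                        "_id": "alice_green_8_110",
                                        "_source": {
                                            "first_name": "Alice",
                                            "age": 8,
                                            "height": 110,
                                        },
                                    }
                                ]
                            }
                        },
                    },
                }
            ]
        }
    }
}

summary = summarize_buckets(sample_response)
```

Note that this only covers measurements taken at age 10 or earlier, because the query itself filters on age.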

I posted the same question on the ElasticSearch support forum, with the emphasis on Painless scripting: "How to find records matching the result of a previous search using ElasticSearch Painless scripting".

The answer was:

"I don't think the Painless approach will work here. You cannot use the results of one query to execute a second query with Painless.

The two-step approach that you outline at the end of your post is the way to go."

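The bottom line is that you cannot use the result of one query as the input to another query. You can filter, aggregate, and so on, but not this.

So the approach, as I understand it, is to run the first search, process the data outside ElasticSearch, and then run an additional search. This basically translates to: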

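  1. Search for the record with first_name=Alice and baseline_height_at_age_10=true.
  2. Process the result externally and extract Alice's height at age 10.
  3. Search for Alice's records whose height is lower than the externally extracted value.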
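
The three steps above could be sketched in client code roughly as follows. This is a minimal illustration under assumptions, not a tested production recipe: all function names are hypothetical, and `search` stands for any callable you supply that sends a request body to the index and returns the parsed response as a dictionary.

```python
def baseline_query(first_name):
    """Step 1: request body that finds the person's baseline document."""
    return {
        "size": 1,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"first_name": first_name}},
                    {"term": {"baseline_height_at_age_10": True}},
                ]
            }
        },
    }


def extract_height(response):
    """Step 2: pull the height out of the first hit, or None if no hit."""
    hits = response["hits"]["hits"]
    if not hits:
        return None
    return hits[0]["_source"]["height"]


def shorter_than_query(first_name, height):
    """Step 3: request body for records strictly below the given height."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"term": {"first_name": first_name}},
                    {"range": {"height": {"lt": height}}},
                ]
            }
        }
    }


def find_shorter_records(search, first_name):
    """Run both searches; `search` is a caller-supplied callable (assumption)."""
    height = extract_height(search(baseline_query(first_name)))
    if height is None:
        return []  # no baseline measurement recorded for this person
    response = search(shorter_than_query(first_name, height))
    return [hit["_source"] for hit in response["hits"]["hits"]]
```

The `term` filters rely on first_name being mapped as a keyword, as in the mapping above. Repeating the call for each name (Alice, then Bob) gives one pair of searches per person.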