Elasticsearch Function Scoring based on max score within array / nested

Question

I have a field in my document that stores an array of integers.

Java Class:

import java.util.List;

public class Clazz {
    public List<Foo> foo;

    public static class Foo {
        public Integer bar;
        public Integer baz;
    }
}

Mapping:

"properties" : {
    "foo" : {
        "properties" : {
          "bar" : {
            "type" : "integer"
          },
          "baz" : {
            "type" : "integer"
          }
        }
    }
}

Example docs:

{
    "id": 1,
    "foo": [
        { "bar": 10 },
        { "bar": 20 }
    ]
},

{
    "id": 2,
    "foo": [
        { "bar": 15 }
    ]
}

Now I want to do my own scoring. The scoring function is given an input value: 10.

The scoring function is basically: "The closer foo.bar is to input, the higher the score. And if foo.bar is lower than input, the score is only half as good."
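To make that rule concrete, here is a minimal sketch in plain Java (not the Elasticsearch API). It follows the same arithmetic as the script in the query further down; the class name ScoreSketch and the helper score are made up for illustration.

```java
final class ScoreSketch {
    // Hypothetical helper: score of a single foo.bar value against the input.
    // Scores are never positive: 0 is an exact match, and values below the
    // input are penalised twice as hard as values above it.
    static int score(int bar, int input) {
        if (bar >= input) {
            return (input - bar) * 1;
        } else {
            return (bar - input) * 2;
        }
    }

    public static void main(String[] args) {
        int input = 10;
        for (int bar : new int[] {10, 15, 20, 5}) {
            System.out.println("bar=" + bar + " score=" + score(bar, input));
        }
    }
}
```

With input 10 this gives 0 for bar 10, -5 for bar 15, and -10 for both bar 20 and bar 5, so a value 5 below the input scores as badly as a value 10 above it.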

Query:

"function_score" : {
    "functions" : [ {
        "script_score" : {
            "script" : "if(doc['foo.bar'].value >= input) { (input - doc['foo.bar'].value) * 1 } else { (doc['foo.bar'].value - input) * 2 }",
            "lang" : "groovy",
            "params" : {
                "input" : 10
            }
        }
    } ],
    "score_mode" : "max",
    "boost_mode" : "replace"
}

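Expected result:

id 1 should come first, because it has a foo.bar that matches input=10.

What happens:

The scoring works perfectly if the documents have only one foo.bar value. If it is an array (as in the document with id 1), Elasticsearch seems to take the last value in the array.

What the query should do:

Take the best score. That is why I used score_mode: max. But it seems this only applies to the functions array in function_score, not (as I expected) to the possible scores within a function.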
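As a rough illustration of the difference (plain Java, not Elasticsearch code; BestScoreSketch and its methods are hypothetical names), the sketch below scores the two example documents once using only the last array value, which is the behaviour I seem to observe, and once using the best score over all values, which is what I want.

```java
import java.util.List;

final class BestScoreSketch {
    // Same arithmetic as the script in the query above.
    static int score(int bar, int input) {
        return bar >= input ? (input - bar) : (bar - input) * 2;
    }

    // Desired behaviour: score every value and keep the best (largest) score.
    // Returns Integer.MIN_VALUE for an empty list.
    static int bestScore(List<Integer> bars, int input) {
        int best = Integer.MIN_VALUE;
        for (int bar : bars) {
            best = Math.max(best, score(bar, input));
        }
        return best;
    }

    // Observed behaviour (an assumption based on the results): only the
    // last value of the array is scored.
    static int lastValueScore(List<Integer> bars, int input) {
        return score(bars.get(bars.size() - 1), input);
    }

    public static void main(String[] args) {
        List<Integer> doc1 = List.of(10, 20); // foo.bar values of id 1
        List<Integer> doc2 = List.of(15);     // foo.bar values of id 2
        int input = 10;

        System.out.println("last value only: id1=" + lastValueScore(doc1, input)
                + " id2=" + lastValueScore(doc2, input));
        System.out.println("best per array:  id1=" + bestScore(doc1, input)
                + " id2=" + bestScore(doc2, input));
    }
}
```

Scoring only the last value gives id 1 a score of -10 and id 2 a score of -5, so id 2 wrongly ranks first. Taking the best score per array gives id 1 a score of 0 and id 2 a score of -5, which is the expected order.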

I read somewhere about using doc['foo.bar'].values (values instead of value), but I don't know how to use it in this case.

Do you have an idea how to get this working?

Answer 1

One way to achieve this using Groovy is as follows, i.e. you can use the list max method on values.

Example:

{
   "query": {
      "function_score": {
         "functions": [
            {
               "script_score": {
                  "script": "scores = doc[\"foo.bar\"].values.collect { it >= input ? (input - it) : (it - input) * 2 }; return scores.max();",
                  "lang": "groovy",
                  "params": {
                     "input": 10
                  }
               }
            }
         ],
         "score_mode": "max",
         "boost_mode": "replace"
      }
   }
}
