Elasticsearch nested function scores and script scoring with function_score
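I am trying to implement custom scoring based on the importance of fields.
However, I need to compare results across multiple indices that hold different document types. These documents have different fields, each with a different importance.
I need the scores of these results to be comparable, so I want to ignore TF/IDF and score normalization.
So if a search query matches 2 important fields and 1 less important field, its score should be twice the important weight plus the less important weight:
(8 * (1 + 1)) + (3 * 1) = 19
The result I get is 11, because the query below seems to ignore the inner function scores and computes:
(8 * 1) + (3 * 1)
The score explanation is also below. It seems to show that the inner function_score is ignored and simply given a constant score of 1 (which is what I want to stop happening).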
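As a rough illustration of the gap between the two calculations above, the arithmetic can be modeled in plain Python. This is not Elasticsearch code; the function names and the `groups` structure are hypothetical and exist only to make the two formulas concrete.

```python
# Hypothetical model of the scoring arithmetic described in the question.
# Each group is (group weight, number of fields in that group that matched).

def desired_score(groups):
    """Score the asker wants: weight times the number of matching fields."""
    return sum(weight * matches for weight, matches in groups)


def observed_score(groups):
    """Score actually returned: a filter is binary, so a group counts at most once."""
    return sum(weight * (1 if matches > 0 else 0) for weight, matches in groups)


groups = [(8, 2), (3, 1)]  # 2 important fields matched, 1 less important field matched

print(desired_score(groups))   # 19
print(observed_score(groups))  # 11
```

The only difference between the two functions is whether the match count inside a group survives, which is exactly the information the outer filter discards.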
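I have tried not nesting the function scores and using simple should queries, as well as using boost_factor instead of 'weight' and giving matching fields a constant score. All of these gave the same result.
I would also be happy to use a script_score instead of multiplying by a constant weight to calculate the outer result. However, the "_score" passed in is not the score I have just calculated; it is the original search score.
Is there another field, other than "_score", that I can use in script_score to get this?
Thanks in advance!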
Query
"query": {
  "function_score": {
    "functions": [
      {
        "weight": 8.0,
        "filter": {
          "fquery": {
            "query": {
              "function_score": {
                "functions": [
                  {
                    "weight": 1.0,
                    "filter": {
                      "fquery": {
                        "query": {
                          "query_string": {
                            "query": "match*",
                            "fields": ["ImportantField1"],
                            "default_operator": "and",
                            "analyzer": "english",
                            "analyze_wildcard": true
                          }
                        }
                      }
                    }
                  },
                  {
                    "weight": 1.0,
                    "filter": {
                      "fquery": {
                        "query": {
                          "query_string": {
                            "query": "match*",
                            "fields": ["ImportantField2"],
                            "default_operator": "and",
                            "analyzer": "english",
                            "analyze_wildcard": true
                          }
                        }
                      }
                    }
                  } // More field queries that don't match omitted for clarity
                ],
                "score_mode": "sum",
                "boost_mode": "replace"
              }
            }
          }
        }
      },
      {
        "weight": 3.0,
        "filter": {
          "fquery": {
            "query": {
              "function_score": {
                "functions": [
                  {
                    "weight": 1.0,
                    "filter": {
                      "fquery": {
                        "query": {
                          "query_string": {
                            "query": "match*",
                            "fields": ["LessImportantField"],
                            "default_operator": "and",
                            "analyzer": "english",
                            "analyze_wildcard": true
                          }
                        }
                      }
                    }
                  } // More field queries that don't match omitted for clarity
                ],
                "query": {
                  "match_all": {}
                },
                "score_mode": "sum",
                "boost_mode": "replace"
              }
            }
          }
        }
      }
    ],
    "query": {
      "match_all": {} // Filtering done here, omitted for clarity
    },
    "score_mode": "sum",
    "boost_mode": "replace"
  }
}
Score explanation
"_explanation": {
  "value": 11,
  "description": "function score, product of:",
  "details": [
    {
      "value": 11,
      "description": "Math.min of",
      "details": [
        {
          "value": 11,
          "description": "function score, score mode [sum]",
          "details": [
            {
              "value": 8,
              "description": "function score, product of:",
              "details": [
                {
                  "value": 1,
                  "description": "match filter: QueryWrapperFilter(function score (ConstantScore(*:*), functions: [{filter(QueryWrapperFilter(ImportantField1:match*)), function [org.elasticsearch.common.lucene.search.function.WeightFactorFunction@64b3fd0e]}{filter(QueryWrapperFilter(ImportantField2:match*)), function [org.elasticsearch.common.lucene.search.function.WeightFactorFunction@38ed4b5c]}]))"
                },
                {
                  "value": 8,
                  "description": "product of:",
                  "details": [
                    {
                      "value": 1,
                      "description": "constant score 1.0 - no function provided"
                    },
                    {
                      "value": 8,
                      "description": "weight"
                    }
                  ]
                }
              ]
            },
            {
              "value": 3,
              "description": "function score, product of:",
              "details": [
                {
                  "value": 1,
                  "description": "match filter: QueryWrapperFilter(function score (ConstantScore(*:*), functions: [{filter(QueryWrapperFilter(LessImportantField:match*)), function [org.elasticsearch.common.lucene.search.function.WeightFactorFunction@3ce99ebf]}]))"
                },
                {
                  "value": 3,
                  "description": "product of:",
                  "details": [
                    {
                      "value": 1,
                      "description": "constant score 1.0 - no function provided"
                    },
                    {
                      "value": 3,
                      "description": "weight"
                    }
                  ]
                }
              ]
            }
          ]
        },
        {
          "value": 3.4028235e+38,
          "description": "maxBoost"
        }
      ]
    },
    {
      "value": 1,
      "description": "queryBoost"
    }
  ]
}
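So this is not possible. function_score only uses filters in its functions to decide where to apply scores. This means they either match or they don't, so there is no way to pass the score of a nested function_score through.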
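Because each filter is binary, one way to approximate the target arithmetic without nesting might be to give every field its own function and repeat the group weight on each. The Python sketch below only builds such a request body as a dict, reusing the `fquery`, `query_string`, `weight`, `score_mode` and `boost_mode` syntax from the question. The helper names and `FIELD_WEIGHTS` are hypothetical, and the body has not been run against a cluster, so treat it as an assumption rather than a confirmed workaround.

```python
import json

# Hypothetical mapping: each field carries the weight of its importance group.
FIELD_WEIGHTS = {
    "ImportantField1": 8.0,
    "ImportantField2": 8.0,
    "LessImportantField": 3.0,
}


def field_function(field, weight, text="match*"):
    """Build one function entry: a binary filter on a single field plus its weight."""
    return {
        "weight": weight,
        "filter": {
            "fquery": {
                "query": {
                    "query_string": {
                        "query": text,
                        "fields": [field],
                        "default_operator": "and",
                        "analyzer": "english",
                        "analyze_wildcard": True,
                    }
                }
            }
        },
    }


def flat_function_score(field_weights):
    """Build a single-level function_score body with one function per field."""
    return {
        "query": {
            "function_score": {
                "functions": [field_function(f, w) for f, w in field_weights.items()],
                "query": {"match_all": {}},
                "score_mode": "sum",
                "boost_mode": "replace",
            }
        }
    }


body = flat_function_score(FIELD_WEIGHTS)
print(json.dumps(body, indent=2))
```

With `score_mode` set to `sum`, a document matching all three fields should receive 8 + 8 + 3 = 19 under this layout, since each matching filter contributes its own weight.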
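I did disable query normalization using the following:
"similarity": {
  "default": {
    "queryNorm": "1",
    "type": //whatever type you want
  }
}
However, this meant that TF/IDF became a problem for me, because those values were different for each of my indices. I ended up writing a custom similarity class and setting these values to a constant 1.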
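To see why fixing these factors at 1 makes scores comparable across indices, here is a simplified Python model of Lucene's classic TF/IDF practical scoring formula (coord and queryNorm multiplied by the sum of tf, idf squared, boost and norm per term). It is not the custom similarity class mentioned above. The `TermStats` type and all statistics in it are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TermStats:
    """Per-term factors of the classic formula; the defaults model a constant similarity."""
    boost: float
    tf: float = 1.0
    idf: float = 1.0
    norm: float = 1.0


def classic_score(terms, coord=1.0, query_norm=1.0):
    """Simplified classic score: coord * queryNorm * sum(tf * idf^2 * boost * norm)."""
    return coord * query_norm * sum(t.tf * t.idf ** 2 * t.boost * t.norm for t in terms)


# The same logical match in two indices with different (made-up) statistics.
index_a = [TermStats(boost=8.0, tf=1.4, idf=2.1, norm=0.5)]
index_b = [TermStats(boost=8.0, tf=1.0, idf=3.7, norm=0.25)]
print(classic_score(index_a) != classic_score(index_b))  # True: scores are not comparable

# With every statistic pinned to 1, only the boosts remain.
constant = [TermStats(boost=8.0), TermStats(boost=8.0), TermStats(boost=3.0)]
print(classic_score(constant))  # 19.0
```

Once tf, idf, norm, coord and queryNorm are all 1, the score depends only on the configured boosts, which do not vary from index to index.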