Sorting elastic hits on 'prefix first' logic

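I want to implement a sorted result set for auto-suggest in which terms that begin with the search term appear at the top, followed by terms that merely 'contain' it somewhere in the text. For example, for the search term "advocate" the results should be:

advocate x
advocate Yx
Some advocate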
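As a rough illustration of the ordering being asked for, here is a minimal client-side sketch in Python. It is not Elasticsearch code; `prefix_first` is a hypothetical helper name, and the alphabetical tie-break within each group is an assumption made only to keep the output deterministic.

```python
def prefix_first(terms, query):
    """Return terms containing `query`, with prefix matches ranked first."""
    q = query.lower()
    # Keep only terms that contain the query somewhere (case-insensitive).
    matches = [t for t in terms if q in t.lower()]
    # False sorts before True, so terms starting with the query come first;
    # ties inside each group are broken alphabetically (an assumption).
    return sorted(matches, key=lambda t: (not t.lower().startswith(q), t.lower()))


terms = ["Some advocate", "advocate Yx", "unrelated", "advocate x"]
ranked = prefix_first(terms, "advocate")
print(ranked)
```

The goal is to get the same two-tier ordering from Elasticsearch scoring rather than from post-processing.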

My result set, however, gives a higher score to results that contain the term than to results that 'begin with' it. How do I fix this?

Mapping JSON:

{
  "settings": {
    "index": {
      "max_ngram_diff": 39
    },
    "analysis": {
      "normalizer": {
        "custom_normalizer": {
          "type": "custom",
          "char_filter": [],
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      },
      "analyzer": {
        "custom_analyzer": {
          "tokenizer": "custom_tokenizer",
          "filter": [
            "lowercase"
          ]
        },
        "autocomplete_search": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": "lowercase"
        }
      },
      "tokenizer": {
        "custom_tokenizer": {
          "type": "ngram",
          "min_gram": 1,
          "max_gram": 40,
          "token_chars": [
            "letter",
            "digit",
            "whitespace",
            "punctuation",
            "symbol"
          ]
        }
      }
    }
  },
  "mappings": {
    "relations": {
      "properties": {
        "primaryTerm": {
          "type": "text",
          "analyzer": "custom_analyzer",
          "search_analyzer": "autocomplete_search",
          "fielddata": "true",
          "fields": {
            "raw": {
              "type": "keyword",
              "normalizer": "custom_normalizer"
            }
          }
        },
        "entityType": {
          "type": "keyword",
          "normalizer": "custom_normalizer"
        },
        "variants": {
          "type": "text",
          "analyzer": "custom_analyzer",
          "search_analyzer": "autocomplete_search",
          "fielddata": "true",
          "fields": {
            "raw": {
              "type": "keyword",
              "normalizer": "custom_normalizer"
            }
          }
        }
      }
    }
  }
}

Search query:

String query = "{\"bool\": {\"should\": [{\"query_string\": {\"query\": \"advocate\", \"fields\": [\"primaryTerm\"]}}, {\"query_string\": {\"query\": \"advocate\", \"fields\": [\"primaryTerm.raw^2\"]}}]}}";

Elastic results:

{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":12,"max_score":6.094379,"hits":[{"_index":"agencyvars","_type":"relations","_id":"qCeqHHgBcFeeTWhjAoua","_score":6.094379,"_source":{"entityType":"Agency","primaryTerm":"ACT ADVOCATES","variants":[]}},{"_index":"agencyvars","_type":"relations","_id":"OyeqHHgBcFeeTWhjJYxu","_score":5.6339674,"_source":{"entityType":"Agency","primaryTerm":"TALWAR ADVOCATES","variants":["TALWAR & ADVOCATES"]}},{"_index":"agencyvars","_type":"relations","_id":"BSeqHHgBcFeeTWhjGIyJ","_score":5.1183944,"_source":{"entityType":"Agency","primaryTerm":"ZEUSIP ADVOCATES LLP","variants":["ZEUS IP, ADVOCATES","ZEUSIP ADVOCATES","ZEUS IP ADVOCATES","ZEUS IP","ZEUSIPADVOCATES LLP","ZIUSIP ADVOCATES"]}},{"_index":"agencyvars","_type":"relations","_id":"3CeqHHgBcFeeTWhjTYyZ","_score":4.6892724,"_source":{"entityType":"Agency","primaryTerm":"MURTI & MURTI ADVOCATES","variants":[]}},{"_index":"agencyvars","_type":"relations","_id":"0SeqHHgBcFeeTWhjjI18","_score":4.4118576,"_source":{"entityType":"Agency","primaryTerm":"ANAND AND ANAND ADVOCATES","variants":["AANAND & ANAND ADVOCATES","NAND AND ANAND ADVOCATES","ANAND & ANAND, ADVOCATES","ANAND & ANAND ADVOCATES","ANAND & ANAND,ADVOCATES","ANAND & ANAND","ANAND&ANAND","ANAND AND ANAND ADVOCAETES","ANAND AND ANAND ADVOCATE","ANAND AND ANANDADVOCATES","AND ANAND ADVOCATES","ANAND & ANAND ADVOCATES.","ANAND AND ANAN","ANAND AND ANAND","ANAND AND ANAND ADVOCATES,","ANAND AND ANAND ADVOCATES.","ANAND AND ANAND , ADVOCATES","ANAND AND"]}},{"_index":"agencyvars","_type":"relations","_id":"2CeqHHgBcFeeTWhjTIyn","_score":3.2560868,"_source":{"entityType":"Agency","primaryTerm":"STAR IP Advocates and IPR Attorneys","variants":["STARIP, ADVOCATES & IP ATTORNEYS"]}},{"_index":"agencyvars","_type":"relations","_id":"3yeqHHgBcFeeTWhjD4uW","_score":2.521993,"_source":{"entityType":"Agency","primaryTerm":"ADVOCATE AND PATENTS & TRADE MARKS ATTORNEY","variants":[]}}]}}

In summary, the scores are:

"_score":5.6339674,"_source":{"primaryTerm":"TALWAR ADVOCATES"}

"_score":5.1183944,"_source":{"primaryTerm":"INTELLEXIP ADVOCATES"}

"_score":2.521993,"_source":{"primaryTerm":"ADVOCATE AND PATENTS & TRADE MARKS ATTORNEY"}

PS: Since I am new to Elastic, a short explanation along with the answer would be much appreciated.

To apply the prefix-first logic, you can use a prefix query together with the boost parameter. Try the query below:

{
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "query": "advocate",
            "fields": [
              "primaryTerm"
            ]
          }
        },
        {
          "prefix": {
            "primaryTerm.raw": {
              "value": "advocate",
              "boost": 2
            }
          }
        }
      ]
    }
  }
}

The search result will be:

"hits": [
      {
        "_index": "67049029",
        "_type": "_doc",
        "_id": "1",
        "_score": 2.0386105,
        "_source": {
          "primaryTerm": "ADVOCATE AND PATENTS & TRADE MARKS ATTORNEY"
        }
      },
      {
        "_index": "67049029",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.08597656,
        "_source": {
          "primaryTerm": "TALWAR ADVOCATES"
        }
      },
      {
        "_index": "67049029",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.07815027,
        "_source": {
          "primaryTerm": "INTELLEXIP ADVOCATES"
        }
      }
    ]
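
If the request body is assembled in code rather than written by hand, the query above could be built along the following lines. This is only a sketch: `build_prefix_first_query` and its parameter names are hypothetical, and lowercasing the term on the client is a precaution taken on the assumption that the prefix clause may not be normalized the way the indexed `primaryTerm.raw` values are (it does not reproduce the `asciifolding` step).

```python
import json


def build_prefix_first_query(term, text_field="primaryTerm",
                             keyword_field="primaryTerm.raw", boost=2):
    """Build a bool/should body: n-gram match plus a boosted prefix clause."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Matches any document whose n-grams contain the term.
                    {"query_string": {"query": term, "fields": [text_field]}},
                    # Extra score for documents whose keyword value starts
                    # with the term; lowercased to mirror custom_normalizer.
                    {"prefix": {keyword_field: {"value": term.lower(),
                                                "boost": boost}}},
                ]
            }
        }
    }


body = build_prefix_first_query("Advocate", boost=2)
print(json.dumps(body, indent=2))
```

Keeping the boost as a parameter makes it easy to raise it if prefix matches still rank too low on real data.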

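Update 1:

A boost of 2 does not work in your case because the score of "TALWAR ADVOCATES" is 5.6339674, while the score of "ADVOCATE AND PATENTS & TRADE MARKS ATTORNEY" is 2.521993.

Multiplying 2.521993 by 2 gives 5.043986. Since 5.043986 < 5.6339674, you do not get the expected search result. A boost of 10 will therefore work for you. More generally, the boost has to be large enough to lift the prefix match above the highest-scoring non-prefix hit, so a value only slightly above 2 may not be enough.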
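
The arithmetic above can be checked with a few lines of Python. This sketch treats the boost as a simple multiplier on the prefix hit's score, exactly as the estimate above does; that is an approximation, because the clauses of a bool/should query are summed, so the real threshold on your index may differ (setting "explain": true in the search request shows the actual breakdown). The scores are the ones reported in the question, and `prefix_hit_wins` is a hypothetical helper.

```python
# Scores reported in the question for hits that merely contain the term.
non_prefix_scores = {
    "ACT ADVOCATES": 6.094379,
    "TALWAR ADVOCATES": 5.6339674,
}
# Score reported for the hit that starts with the term.
prefix_hit_score = 2.521993


def prefix_hit_wins(boost):
    """Rough check: does the boosted prefix hit beat every non-prefix hit?"""
    return prefix_hit_score * boost > max(non_prefix_scores.values())


for boost in (2, 2.3, 10):
    print(boost, prefix_hit_wins(boost))
```

Under this estimate a boost of 2.3 already beats "TALWAR ADVOCATES" but still falls short of "ACT ADVOCATES", which is why a comfortably larger value such as 10 is the safer choice.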