How to Order Completion Suggester with Fuzziness

Question

When using the Completion Suggester with fuzziness defined, the suggestions are ordered alphabetically instead of by relevance. It seems that whatever the fuzziness is set to is removed from the end of the search/query term. This is not what I expected from reading the Completion Suggester fuzziness docs, which state:

Suggestions that share the longest prefix to the query prefix will be scored higher.

But that is not the case. Here is a use case that demonstrates it:

PUT test/
{
  "mappings":{
    "properties":{
      "id":{
        "type":"integer"
      },
      "title":{
        "type":"keyword",
        "fields": {
          "suggest": {
            "type": "completion"
          }
        }
      }
    }
  }
}

POST test/_bulk
{ "index" : {"_id": "1"}}
{ "title": "HOLARAT" }
{ "index" : {"_id": "2"}}
{ "title": "HOLBROOK" }
{ "index" : {"_id": "3"}}
{ "title": "HOLCONNEN" }
{ "index" : {"_id": "4"}}
{ "title": "HOLDEN" }
{ "index" : {"_id": "5"}}
{ "title": "HOLLAND" }

The above creates an index and adds some data.

If a suggest query is then run against that data:

POST test/_search
{
  "_source": {
    "includes": [
      "title"
    ]
  },
  "suggest": {
    "title-suggestion": {
      "completion": {
        "fuzzy": {
          "fuzziness": "1"
        },
        "field": "title.suggest",
        "size": 3
      },
      "prefix": "HOLL"
    }
  }
}

It returns the first 3 results ordered alphabetically by the last matching character, rather than putting the longest prefix match (i.e. HOLLAND) first:

{
  ...
  "suggest" : {
    "title-suggestion" : [
      {
        "text" : "HOLL",
        "offset" : 0,
        "length" : 4,
        "options" : [
          {
            "text" : "HOLARAT",
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "1",
            "_score" : 3.0,
            "_source" : {
              "title" : "HOLARAT"
            }
          },
          {
            "text" : "HOLBROOK",
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "2",
            "_score" : 3.0,
            "_source" : {
              "title" : "HOLBROOK"
            }
          },
          {
            "text" : "HOLCONNEN",
            "_index" : "test",
            "_type" : "_doc",
            "_id" : "3",
            "_score" : 3.0,
            "_source" : {
              "title" : "HOLCONNEN"
            }
          }
        ]
      }
    ]
  }
}

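If the size parameter is removed, we can see that all entries have the same score, rather than the longest prefix match scoring higher as the docs state.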

In this case, how can the results of a Completion Suggester with fuzziness defined be ordered so that the longest prefix match comes first?

Answer 1

This has been reported in the past, and this behavior is actually by design.

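What I usually do in this situation is send two suggest queries (similar to what has been suggested here), one for an exact match and another for a fuzzy match. If the exact match returns suggestions, I use those; otherwise I fall back to the fuzzy ones.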

With the suggest query below, you will get HOLLAND under exact-suggestion and the fuzzy matches under fuzzy-suggestion:

POST test/_search
{
  "_source": {
    "includes": [
      "title"
    ]
  },
  "suggest": {
    "fuzzy-suggestion": {
      "completion": {
        "fuzzy": {
          "fuzziness": "1"
        },
        "field": "title.suggest",
        "size": 3
      },
      "prefix": "HOLL"
    },
    "exact-suggestion": {
      "completion": {
        "field": "title.suggest",
        "size": 3
      },
      "prefix": "HOLL"
    }
  }
}
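
On the client side, the two result sets still have to be combined. Below is a minimal Python sketch of one way to do that, assuming the response has already been parsed into a dict shaped like the output shown in the question. The function name merge_suggestions and the sample data are hypothetical and hand-written for illustration, not real Elasticsearch output. It goes slightly beyond the strict fallback described above: exact matches come first, and any remaining slots are filled with fuzzy matches that were not already returned.

```python
def merge_suggestions(response, size=3):
    """Return suggestion texts with exact-prefix matches ahead of fuzzy ones."""
    suggest = response.get("suggest", {})

    def options(name):
        # Each named suggester returns a list of entries, each holding "options".
        return [opt
                for entry in suggest.get(name, [])
                for opt in entry.get("options", [])]

    merged, seen = [], set()
    # Exact matches are listed first so they win over their fuzzy duplicates.
    for opt in options("exact-suggestion") + options("fuzzy-suggestion"):
        key = (opt.get("_index"), opt.get("_id"))
        if key in seen:
            continue
        seen.add(key)
        merged.append(opt["text"])
    return merged[:size]


# Hand-written fragment in the shape of the response above (illustrative only).
sample = {
    "suggest": {
        "fuzzy-suggestion": [{"text": "HOLL", "options": [
            {"text": "HOLARAT", "_index": "test", "_id": "1"},
            {"text": "HOLBROOK", "_index": "test", "_id": "2"},
            {"text": "HOLCONNEN", "_index": "test", "_id": "3"},
        ]}],
        "exact-suggestion": [{"text": "HOLL", "options": [
            {"text": "HOLLAND", "_index": "test", "_id": "5"},
        ]}],
    }
}

print(merge_suggestions(sample))  # ['HOLLAND', 'HOLARAT', 'HOLBROOK']
```

If only the strict fallback is wanted, returning the exact options whenever that list is non-empty and the fuzzy options otherwise would be enough.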


elasticsearch