Fuzzy not functioning as expected (one term search, see example)

Consider the following query:

curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '
{ "query" :
     {"match":
        {"last_name": "Smith"}
     }
}'

Results:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.30685282,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "1",
        "_score": 0.30685282,
        "_source": {
          "first_name": "John",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing on the weekends.",
          "interests": [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 0.30685282,
        "_source": {
          "first_name": "Jane",
          "last_name": "Smith",
          "age": 25,
          "about": "I love to go rock climbing",
          "interests": [
            "sports",
            "music"
          ]
        }
      }
    ]
  }
}

Now, when I execute the following query:

curl -XGET 'http://localhost:9200/megacorp/employee/_search' -d '
{ "query" :
        {"fuzzy":
             {"last_name":
                  {"value": "Smitt",
                   "fuzziness": 1
                  }
             }
        }
}'

It returns NO results, even though the Levenshtein distance between "Smith" and "Smitt" is 1. The same thing happens with "Smit". If I set fuzziness to 2, I do get results. What am I missing here?

I assume the last_name field you are querying is an analyzed string. The indexed term will be smith, not Smith.

Returns NO results despite the Levenshtein distance of "Smith" and "Smitt" being 1.

The fuzzy query does not analyze the search term, so your Levenshtein distance is actually 2, not 1:

  1. Smitt -> Smith
  2. Smith -> smith
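
To check the two steps above by hand, here is a minimal sketch of a plain Levenshtein distance in Python. It assumes the analyzer simply lowercases the indexed value, as described above. The `levenshtein` helper is hypothetical and written only for this illustration, it is not part of Elasticsearch. Elasticsearch's fuzzy matching also counts transpositions by default, which makes no difference for these particular terms.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: insertions, deletions and substitutions each cost 1."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1  # comparison is case-sensitive
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


# Assumption: the analyzed field stores the lowercased term.
indexed_term = "Smith".lower()

for query_term in ("Smitt", "Smit"):
    print(query_term,
          "| distance to 'Smith':", levenshtein(query_term, "Smith"),
          "| distance to '" + indexed_term + "':",
          levenshtein(query_term, indexed_term))
```

For both "Smitt" and "Smit", the distance to "Smith" is 1, but the distance to the indexed term "smith" is 2, which is why fuzziness 2 matches and fuzziness 1 does not.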

Try this mapping, and your query with fuzziness = 1 will work:

PUT /megacorp/employee/_mapping
{
  "employee":{
    "properties":{
      "last_name":{
        "type":"string",
        "index":"not_analyzed"
      }
    }
  }
}

Hope this helps.