ElasticSearch

Question

Suppose I have three documents in my Elasticsearch index. For example:

1: {
    "name": "test_2602"
   }
2: {
    "name": "test-2602"
   }
3: {
    "name": "test 2602"
   }

Now, when I search with the fuzzy match query given below:

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "match": {
                  "name": {
                    "query": "test-2602",
                    "fuzziness": "2",
                    "prefix_length": 0,
                    "max_expansions": 50,
                    "fuzzy_transpositions": true,
                    "lenient": false,
                    "zero_terms_query": "NONE",
                    "boost": 1
                  }
                }
              }
            ],
            "disable_coord": false,
            "adjust_pure_negative": true,
            "boost": 1
          }
        }
      ],
      "disable_coord": false,
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}

In response I get only two documents (whether I search the name for "test", "test 2602", or "test-2602"):

  {
    "name": "test-2602"
  },
  {
    "name": "test 2602"
  }

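I do not get the document whose name is "test_2602" (the value containing an underscore is not matched). I want the result to include that third document as well. However, if I search the name for "test_2602", the response is: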

 {
   "name": "test_2602"
 }

Whenever I search the name for "test", "test 2602", "test-2602", or "test_2602", I need to get all three documents.

Answer 1

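Your search returns only two documents because, by default, Elasticsearch uses the standard analyzer, which tokenizes "test-2602" and "test 2602" into the two tokens test and 2602. "test_2602", however, is not split: it is indexed as a single token.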

You can check the generated tokens by using the analyze API:

GET /_analyze
{
  "analyzer" : "standard",
  "text" : "test_2602"
}

The generated tokens will be:

{
  "tokens": [
    {
      "token": "test_2602",
      "start_offset": 0,
      "end_offset": 9,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}
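The splitting behavior described above can be approximated outside Elasticsearch. The following is only a rough sketch: `approx_standard_tokens` is a hypothetical helper, and a `\w+` regular expression is a stand-in for the standard tokenizer, not its actual implementation. It does reproduce the relevant detail, namely that an underscore joins characters into one token while a hyphen or a space separates them.

```python
import re
from typing import List


def approx_standard_tokens(text: str) -> List[str]:
    """Rough stand-in for the standard analyzer (an assumption, not the
    real tokenizer): lowercase the text, then keep runs of letters,
    digits, and underscores as tokens."""
    return re.findall(r"\w+", text.lower())


for name in ["test_2602", "test-2602", "test 2602"]:
    # The underscore value stays whole; the other two split in two.
    print(name, "->", approx_standard_tokens(name))
```

For the three sample names this yields a single token for "test_2602" and the pair test, 2602 for the other two, which agrees with the analyze API output shown above.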

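You need to add .keyword to the name field. The query then runs against the keyword sub-field, where the whole value is indexed as a single unanalyzed term instead of being passed through the standard analyzer (note the ".keyword" after the name field). Try the query below: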

Index mapping:

{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}

Search query:

{
  "query": {
    "match": {
      "name.keyword": {
        "query": "test_2602",
        "fuzziness":2
      }
    }
  }
}

Search result:

"hits": [
      {
        "_index": "66572330",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.9808291,
        "_source": {
          "name": "test_2602"
        }
      },
      {
        "_index": "66572330",
        "_type": "_doc",
        "_id": "3",
        "_score": 0.8718481,
        "_source": {
          "name": "test 2602"
        }
      },
      {
        "_index": "66572330",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.8718481,
        "_source": {
          "name": "test-2602"
        }
      }
    ]
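To see why comparing whole values matches all three documents while the analyzed field does not, it helps to compute the edit distances involved. The sketch below is illustrative only: `levenshtein` is a hand-written helper using plain Levenshtein distance (Elasticsearch also counts transpositions by default, which makes no difference for these strings), and `MAX_EDITS` simply mirrors the fuzziness value of 2 used in the queries.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


MAX_EDITS = 2  # mirrors "fuzziness": 2 in the queries

# Whole values, as compared on name.keyword.
for stored in ["test_2602", "test-2602", "test 2602"]:
    d = levenshtein("test_2602", stored)
    print("keyword:", stored, "distance", d, "match", d <= MAX_EDITS)

# Analyzed query terms against the single indexed term test_2602.
for term in ["test", "2602"]:
    d = levenshtein(term, "test_2602")
    print("text:", term, "distance", d, "match", d <= MAX_EDITS)
```

The three full names are within one edit of each other, so a fuzziness of 2 on name.keyword covers them. The analyzed terms test and 2602 are each five edits away from test_2602, which is why the original query on the text field misses that document. By the same arithmetic, a search for just "test" on name.keyword would also fall outside the allowed two edits.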

ElasticSearch - Unable To Search Using Fuzzy Match Query For Underscore in value (ES Fuzzy not matching underscore value)
