How to normalize periods in an Elasticsearch query (such that JJ Abrams == J.J Abrams)?

Question

I need words that contain periods to match their period-free variants.

I see that the documentation has a section on analyzers and token filters, but I find it rather terse and I'm not sure how to go about this.

Answer 1

Use a char filter to strip the dots, for example:

PUT /no_dots
{
  "settings": {
    "analysis": {
      "char_filter": {
        "my_mapping": {
          "type": "mapping",
          "mappings": [
            ".=>"
          ]
        }
      },
      "analyzer": {
        "my_no_dots_analyzer": {
          "tokenizer": "standard",
          "char_filter": [
            "my_mapping"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "text": {
          "type": "string",
          "analyzer": "my_no_dots_analyzer"
        }
      }
    }
  }
}

To test it, GET /no_dots/_analyze?analyzer=my_no_dots_analyzer&text=J.J Abrams returns:

{
   "tokens": [
      {
         "token": "JJ",
         "start_offset": 0,
         "end_offset": 3,
         "type": "<ALPHANUM>",
         "position": 1
      },
      {
         "token": "Abrams",
         "start_offset": 4,
         "end_offset": 10,
         "type": "<ALPHANUM>",
         "position": 2
      }
   ]
}
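
To see why this makes both spellings match, here is a rough Python sketch of what the analyzer does to the text. It is only an illustration, not Elasticsearch's implementation: the function names strip_dots and analyze are made up for this example, and the \w+ regex is an assumed stand-in for the standard tokenizer.

```python
import re


def strip_dots(text):
    """Mimic the mapping char filter rule ".=>" by deleting every period."""
    return text.replace(".", "")


def analyze(text):
    """Rough stand-in for my_no_dots_analyzer: char filter first, then tokenize."""
    # Assumption: \w+ only approximates how the standard tokenizer splits words.
    return re.findall(r"\w+", strip_dots(text))


# Both spellings reduce to the same token list, so they match each other.
print(analyze("J.J Abrams"))  # ['JJ', 'Abrams']
print(analyze("JJ Abrams"))   # ['JJ', 'Abrams']
```

Because the analyzer set on the field is also applied to the query text by default, a match query for either spelling ends up as the same tokens. Note that the analyzer above has no lowercase token filter, so "jj abrams" would still not match "JJ Abrams" unless you add one.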


elasticsearch