Elasticsearch symbol synonyms for non-English keyboards

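I am indexing a website with Elasticsearch, and many of its names contain Scandinavian characters. The problem is that our users typically use US English keyboards and replace these characters with the closest English letter. For example, Tromsø is indexed, but users search for Tromso.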

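How can I add character synonyms in Elasticsearch so that the original characters and their English replacements are treated as equal at search time?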

You can create a custom analyzer and set a char filter like this:

PUT my_index
{
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "tokenizer": "standard",
                    "char_filter": [
                        "my_char_filter"
                    ],
                    "filter": [
                        "lowercase"
                    ]
                }
            },
            "char_filter": {
                "my_char_filter": {
                    "type": "mapping",
                    "mappings": [
                        "ø => o",
                        "á => a"
                    ]
                }
            }
        }
    }
}

With this analyzer, Tromsø and Tromso produce the same output term. See the documentation on creating custom analyzers: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-custom-analyzer.html
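
To check the result, you can run both spellings through the analyzer with the _analyze API. This is only a sketch: it assumes the index was created as my_index with an analyzer named my_analyzer, as above. The second request assumes Elasticsearch 7 or later (typeless mappings), and the field name "name" is a hypothetical placeholder for whichever text field holds the place names.

POST my_index/_analyze
{
    "analyzer": "my_analyzer",
    "text": "Tromsø Tromso"
}

PUT my_index/_mapping
{
    "properties": {
        "name": {
            "type": "text",
            "analyzer": "my_analyzer"
        }
    }
}

The first request should return two identical tokens if the char filter is applied. The second assigns the analyzer to a field, so it is used at both index and search time for that field unless a separate search_analyzer is set. Note that a mapping char filter matches characters exactly, so an uppercase Ø would need its own entry in the mappings list.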