elastic4s: how to add analyzer/filter for german_phonebook to analysis?

How can I add the following german_phonebook analyzer to Elasticsearch using elastic4s?

        "index": {
            "analysis": {
                "analyzer": {
                    "german": {
                        "filter": [
                            "lowercase",
                            "german_stop",
                            "german_normalization",
                            "german_stemmer"
                        ],
                        "tokenizer": "standard"
                    },
                    "german_phonebook": {
                        "filter": [
                            "german_phonebook"
                        ],
                        "tokenizer": "keyword"
                    },
                    "mySynonyms": {
                        "filter": [
                            "lowercase",
                            "mySynonymFilter"
                        ],
                        "tokenizer": "standard"
                    }
                },
                "filter": {
                    "german_phonebook": {
                        "country": "CH",
                        "language": "de",
                        "type": "icu_collation",
                        "variant": "@collation=phonebook"
                    },
                    "german_stemmer": {
                        "language": "light_german",
                        "type": "stemmer"
                    },
                    "german_stop": {
                        "stopwords": "_german_",
                        "type": "stop"
                    },
                    "mySynonymFilter": {
                        "synonyms": [
                            "swisslift,lift"
                        ],
                        "type": "synonym"
                    }
                }
            },

The core issue here is the german_phonebook filter, which has type icu_collation.

Which filter should be used for it?

...

Based on the answer, I came up with this code:

  case class GPhonebook() extends TokenFilterDefinition {
    val filterType = "phonebook"
    def name = "german_phonebook"
    override def build(source: XContentBuilder): Unit = {
      source.field("tokenizer", "keyword")
      source.field("country", "CH")
      source.field("language", "de")
      source.field("type", "icu_collation")
      source.field("variant", "@collation=phonebook")  
    }
  }

The analyzer definition now looks like this:

  CustomAnalyzerDefinition(
      "german_phonebook",
      KeywordTokenizer("myKeywordTokenizer2"),
      GPhonebook()
  )

What you really want is some way of saying

CustomTokenFilter("german_phonebook") or BuiltInTokenFilter("german_phonebook"), but you can't (I will add that).

So for now, you need to extend TokenFilterDefinition.

For example, something like:

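case class GPhonebook() extends TokenFilterDefinition {
  val filterType = "phonebook"
  // the name under which the filter is registered in the index settings
  def name = "german_phonebook"
  override def build(source: XContentBuilder): Unit = {
    // set extra params in here
  }
}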
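
To make the shape of the extra params concrete, here is a minimal sketch that runs without elastic4s. `FieldBuilder`, `FilterSketch` and `GPhonebookSketch` are hypothetical stand-ins, not library types: they assume the base definition writes the `type` field from `filterType` and that `build` only adds the remaining parameters. The real `TokenFilterDefinition` may behave differently depending on the elastic4s version, so check its source before relying on this.

```scala
import scala.collection.mutable.LinkedHashMap

// Hypothetical stand-in for XContentBuilder: collects string fields in insertion order.
final class FieldBuilder {
  private val fields = LinkedHashMap.empty[String, String]
  def field(name: String, value: String): FieldBuilder = { fields(name) = value; this }
  def toJson: String =
    fields.map { case (k, v) => "\"" + k + "\": \"" + v + "\"" }.mkString("{", ", ", "}")
}

// Hypothetical mirror of a token filter definition (an assumption, not the library trait):
// the base writes "type" from filterType, subclasses add only their own parameters.
trait FilterSketch {
  def name: String
  def filterType: String
  protected def build(source: FieldBuilder): Unit
  def json: String = {
    val source = new FieldBuilder
    source.field("type", filterType)
    build(source)
    source.toJson
  }
}

// Filter parameters taken from the settings in the question.
case class GPhonebookSketch() extends FilterSketch {
  val name = "german_phonebook"
  val filterType = "icu_collation"
  protected def build(source: FieldBuilder): Unit = {
    source.field("language", "de")
    source.field("country", "CH")
    source.field("variant", "@collation=phonebook")
  }
}

object Demo extends App {
  // prints {"type": "icu_collation", "language": "de", "country": "CH", "variant": "@collation=phonebook"}
  println(GPhonebookSketch().json)
}
```

The sketch leaves `tokenizer` out of the filter body on purpose: in the settings from the question, `tokenizer` belongs to the analyzer entry, not to the filter entry. The `icu_collation` filter type also needs the ICU analysis plugin installed on the Elasticsearch nodes.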