Using different language analyzers with ngram Analyzer in one mapping in Elasticsearch
I want to use custom English and German analyzers together with other components such as an ngram filter. Is the following mapping correct? I get an error for the German analyzer: [unknown setting [index.filter.german_stop.type]]. I have searched but found nothing about using multiple language analyzers in a custom type. Is it possible to use a language-specific ngram filter?
PUT test {
"settings": {
"analysis": {
"analyzer": {
"english_analyzer": {
"type": "custom",
"filter": [
"lowercase",
"english_stop",
"ngram_filter_en"
],
"tokenizer": "whitespace"
}
},
"filter": {
"english_stop": {
"type": "stop"
},
"ngram_filter_en": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 25
}
},
"german_analyzer" : {
"type" : "custom",
"filter" : [
"lowercase",
"german_stop",
"ngram_filter_de"
],
"tokenizer" : "whitespace"
}
},
"filter" : {
"german_stop" : {
"type" : "stop"
},
"ngram_filter_de" : {
"type" : "edge_ngram",
"min_ngram" : "1",
"max_gram" : 25
}
}
},
"mappings" : {
"dynamic" : true,
"properties": {
"content" : {
"tye" : "text",
"properties" : {
"en" : {
"type" : "text",
"analyzer" : "english_analyzer"
},
"de" : {
"type" : "text",
"analyzer" : "german_analyzer"
}
}
}
}
}
}
There are a few minor syntax errors:
- Your last filter object sits outside the analysis context.
- You cannot repeat the same key within one JSON object, so the two filter blocks have to be merged (and german_analyzer has to live inside the single analyzer block, next to english_analyzer).
- There are also typos: "min_ngram" should be "min_gram", and "tye" in the mapping should be "type".

So the following settings should help:
{
  "analysis": {
    "analyzer": {
      "english_analyzer": {
        "type": "custom",
        "filter": [
          "lowercase",
          "english_stop",
          "ngram_filter_en"
        ],
        "tokenizer": "whitespace"
      },
      "german_analyzer": {
        "type": "custom",
        "filter": [
          "lowercase",
          "german_stop",
          "ngram_filter_de"
        ],
        "tokenizer": "whitespace"
      }
    },
    "filter": {
      "english_stop": {
        "type": "stop"
      },
      "ngram_filter_en": {
        "type": "edge_ngram",
        "min_gram": 1,
        "max_gram": 25
      },
      "german_stop": {
        "type": "stop"
      },
      "ngram_filter_de": {
        "type": "edge_ngram",
        "min_gram": 1,
        "max_gram": 25
      }
    }
  }
}
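Once the index is created with the settings above, each analyzer can be sanity-checked with the _analyze API (a quick check, using the analyzer names defined in the settings):

POST test/_analyze
{
  "analyzer": "german_analyzer",
  "text": "Schöne neue Welt"
}

The response lists the tokens after lowercasing, stop-word removal, and edge-ngram expansion. One caveat worth checking here: a stop filter declared as just "type": "stop" defaults to the _english_ stop-word list, so german_stop should probably be declared with "stopwords": "_german_" (and english_stop with "stopwords": "_english_") to behave as intended.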
Understanding the errors in the mapping:
{
"analysis": {
"analyzer": {
"filter": {
"english_stop": {
"type": "stop"
},
"ngram_filter_en": {
"type": "edge_ngram",
"min_gram": 1,
"max_gram": 25
}
},
"german_analyzer" : {
"type" : "custom",
"filter" : [
"lowercase",
"german_stop",
"ngram_filter_de"
],
"tokenizer" : "whitespace"
}
},
"filter" : { // This is outside "analysis"; you cannot simply add another "filter" key inside "analysis", so merge both filter blocks as shown above
"german_stop" : {
"type" : "stop"
},
"ngram_filter_de" : {
"type" : "edge_ngram",
"min_ngram" : "1",
"max_gram" : 25
}
}
}
}
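Putting the corrected settings together with a fixed mapping gives a complete request. This is a sketch, not the only valid layout: "tye" is corrected to "type", and since a text field cannot carry sub-"properties", content is modeled here as an object with one text sub-field per language; the explicit "stopwords" values are an assumption about the intended behavior.

PUT test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "english_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "english_stop", "ngram_filter_en"]
        },
        "german_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "german_stop", "ngram_filter_de"]
        }
      },
      "filter": {
        "english_stop": { "type": "stop", "stopwords": "_english_" },
        "german_stop": { "type": "stop", "stopwords": "_german_" },
        "ngram_filter_en": { "type": "edge_ngram", "min_gram": 1, "max_gram": 25 },
        "ngram_filter_de": { "type": "edge_ngram", "min_gram": 1, "max_gram": 25 }
      }
    }
  },
  "mappings": {
    "dynamic": true,
    "properties": {
      "content": {
        "properties": {
          "en": { "type": "text", "analyzer": "english_analyzer" },
          "de": { "type": "text", "analyzer": "german_analyzer" }
        }
      }
    }
  }
}

With this layout, documents are indexed as { "content": { "en": "...", "de": "..." } }, and queries target content.en or content.de so that each language goes through its own analyzer.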