Elasticsearch 6.4 throws an error when creating a custom character filter
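I'm fairly sure I'm missing something in the syntax, but I can't figure out exactly what. I'm trying to create the phone number pattern capture token filter defined here. It says to define a keyword tokenizer and then apply the pattern capture token filter on top. This is what I did: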
{
  "mappings": {
    "_doc": {
      "properties": {
        "phone": {
          "type": "text",
          "analyzer": "my_phone_analyzer"
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "my_phone_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "char_filter": [
            "phone_number"
          ]
        }
      }
    },
    "char_filter": {
      "phone_number": {
        "type": "pattern_capture",
        "preserve_original": 1,
        "patterns": [
          "1(\\d{3}(\\d+))"
        ]
      }
    }
  }
}
This results in the following error:
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "unknown setting [index.char_filter.phone_number.patterns] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "unknown setting [index.char_filter.phone_number.patterns] please check that any required plugins are installed, or check the breaking changes documentation for removed settings",
    "suppressed": [
      {
        "type": "illegal_argument_exception",
        "reason": "unknown setting [index.char_filter.phone_number.preserve_original] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
      },
      {
        "type": "illegal_argument_exception",
        "reason": "unknown setting [index.char_filter.phone_number.type] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
      }
    ]
  },
  "status": 400
}
It would be great if someone could point out what I'm doing wrong!
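There are a couple of issues with the configuration you used to create my_phone_analyzer.
pattern_capture can be used as a token filter, not as a character filter. Read more here: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-pattern-capture-tokenfilter.html
The preserve_original setting does not take 1 as a value; it takes true or false.
With both of those taken into account, I was able to create the analyzer with the same settings as yours (the token filter and the analyzer are both named code in the settings below):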
{
  "settings": {
    "analysis": {
      "filter": {
        "code": {
          "type": "pattern_capture",
          "preserve_original": true,
          "patterns": [
            "1(\\d{3}(\\d+))"
          ]
        }
      },
      "analyzer": {
        "code": {
          "tokenizer": "keyword",
          "filter": [ "code", "lowercase" ]
        }
      }
    }
  }
}
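One more detail that may be worth checking: the error names index.char_filter.phone_number.* rather than index.analysis.char_filter.phone_number.*, which suggests that the char_filter block in the question sat next to analysis instead of inside it. The sketch below is only an illustration in Python. flatten is a hypothetical helper that imitates how nested index settings turn into dotted names; it is not Elasticsearch code, and the two dictionaries are trimmed-down versions of the settings in this thread.

```python
def flatten(settings, prefix="index"):
    """Flatten nested settings into dotted keys (hypothetical helper)."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# char_filter placed beside "analysis", as in the question
misplaced = {
    "analysis": {"analyzer": {"my_phone_analyzer": {"tokenizer": "keyword"}}},
    "char_filter": {"phone_number": {"type": "pattern_capture"}},
}

# filter nested inside "analysis", as in the working settings
nested = {
    "analysis": {
        "analyzer": {"my_phone_analyzer": {"tokenizer": "keyword"}},
        "filter": {"phone_number": {"type": "pattern_capture"}},
    }
}

misplaced_keys = sorted(flatten(misplaced))
nested_keys = sorted(flatten(nested))
print(misplaced_keys)  # contains index.char_filter.phone_number.type
print(nested_keys)     # contains index.analysis.filter.phone_number.type
```

If that reading is right, any filter definition has to live under settings.analysis for Elasticsearch to recognise it as part of the analysis configuration.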
Let me know if you run into any issues.
The link you mentioned looks very old. pattern_capture is no longer available as a char_filter, only as a token filter.
If you are using Elasticsearch 5.x or above, this is how your mapping should look:
PUT <your_index_name>
{
  "mappings": {
    "_doc": {
      "properties": {
        "phone": {
          "type": "text",
          "analyzer": "my_phone_analyzer"
        }
      }
    }
  },
  "settings": {
    "analysis": {
      "analyzer": {
        "my_phone_analyzer": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": [
            "phone_number"
          ]
        }
      },
      "filter": {
        "phone_number": {
          "type": "pattern_capture",
          "preserve_original": true,
          "patterns": [
            "1(\\d{3}(\\d+))"
          ]
        }
      }
    }
  }
}
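A note on the pattern itself: inside a JSON request body, a regex backslash has to be written doubled (\\d), because \d on its own is not a valid JSON escape. The short Python sketch below shows the difference. Python's json module is used here only as a stand-in for a strict JSON parser, and the body strings are cut down to the patterns field for illustration.

```python
import json
import re

# Doubled backslashes in the JSON text decode to a single backslash in the pattern.
valid_body = r'{"patterns": ["1(\\d{3}(\\d+))"]}'
pattern = json.loads(valid_body)["patterns"][0]
print(pattern)

# A single backslash before "d" is not a legal JSON escape, so parsing fails.
invalid_body = r'{"patterns": ["1(\d{3}(\d+))"]}'
try:
    json.loads(invalid_body)
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False
print(parsed_ok)

# The decoded pattern is an ordinary regex and matches the sample phone number.
print(bool(re.fullmatch(pattern, "19195557321")))
```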
You can use the Analyze API to see which tokens are generated, as shown below:
POST <your_index_name>/_analyze
{
  "analyzer": "my_phone_analyzer",
  "text": "19195557321"
}
Tokens:
{
  "tokens": [
    {
      "token": "19195557321",
      "start_offset": 0,
      "end_offset": 11,
      "type": "word",
      "position": 0
    },
    {
      "token": "9195557321",
      "start_offset": 0,
      "end_offset": 11,
      "type": "word",
      "position": 0
    },
    {
      "token": "5557321",
      "start_offset": 0,
      "end_offset": 11,
      "type": "word",
      "position": 0
    }
  ]
}
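To see where those three tokens come from, here is a rough Python imitation of what the filter does with this pattern: it keeps the original token and adds one token per capture group. pattern_capture below is a hypothetical function written for illustration, not the Elasticsearch implementation; it ignores offsets, positions, and any edge-case handling the real filter has.

```python
import re


def pattern_capture(token, patterns, preserve_original=True):
    """Roughly imitate the filter: emit the token plus every non-empty capture group."""
    out = [token] if preserve_original else []
    for pattern in patterns:
        for match in re.finditer(pattern, token):
            for group in match.groups():
                if group:
                    out.append(group)
    return out


tokens = pattern_capture("19195557321", [r"1(\d{3}(\d+))"])
print(tokens)  # ['19195557321', '9195557321', '5557321']
```

The outer group drops the leading 1, and the inner group drops the three-digit area code as well, so a search for any of the three forms can match the same document.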
Hope this helps!