Remove hyphens at search time in ElasticSearch

Question

I want to build a book search using ElasticSearch and Spring Data.

I store the ISBN/EAN of my books without hyphens in my database, and I have indexed this data with ElasticSearch.

Indexed data: 1113333444444

If I search for the ISBN/EAN with hyphens (111-3333-444444), I get no results. If I search without hyphens, my book is found as expected.

My settings look like this:

{
  "analysis": {
    "filter": {
      "clean_special": {
        "type": "pattern_replace",
        "pattern": "[^a-zA-Z0-9]",
        "replacement": ""
      }
    },
    "analyzer": {
      "isbn_search_analyzer": {
        "type": "custom",
        "tokenizer": "keyword",
        "filter": [
          "clean_special"
        ]
      }
    }
  }
}

I map my fields like this:

   @Field(type = FieldType.Keyword, searchAnalyzer = "isbn_search_analyzer")
   private String isbn;
   @Field(type = FieldType.Keyword, searchAnalyzer = "isbn_search_analyzer")
   private String ean;

If I test my analyzer:

GET indexname/_analyze
{
  "analyzer" : "isbn_search_analyzer",
  "text" : "111-3333-444444"
}

I get the following result:

{
  "tokens" : [
    {
      "token" : "1113333444444",
      "start_offset" : 0,
      "end_offset" : 15,
      "type" : "word",
      "position" : 0
    }
  ]
}
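
For readers who want to check the pattern outside ElasticSearch, here is a minimal sketch in plain Java. It only imitates the analyzer above: the keyword tokenizer keeps the whole input as a single token, and the pattern_replace filter applies a Java regular expression to it. The class and method names (CleanSpecialDemo, cleanSpecial) are made up for illustration and are not part of ElasticSearch or Spring Data.

```java
import java.util.regex.Pattern;

// Hypothetical helper that imitates the "isbn_search_analyzer" from the settings.
class CleanSpecialDemo {

    // Same regular expression as the "clean_special" filter.
    private static final Pattern NON_ALPHANUMERIC = Pattern.compile("[^a-zA-Z0-9]");

    // Treats the whole input as one token (like the keyword tokenizer)
    // and replaces every non-alphanumeric character with "".
    static String cleanSpecial(String input) {
        return NON_ALPHANUMERIC.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(cleanSpecial("111-3333-444444")); // prints 1113333444444
        System.out.println(cleanSpecial("111 3333 444444")); // prints 1113333444444
    }
}
```

This agrees with the _analyze output above, so the analyzer definition itself appears to be fine.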

If I search like this:

GET indexname/_search
{
   "query": {
    "query_string": {
      "fields": [ "isbn", "ean" ],
      "query": "111-3333-444444"
    }
  }
}

I get no results. Does anyone have an idea?

Answer 1

Elasticsearch does not analyze fields of type keyword. You need to set the type to text.

Answer 2

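As @P.J.Meisch said, you did everything correctly except for the field data type: you defined the fields as keyword when they need to be text. Even though you explicitly tell ElasticSearch to use your custom analyzer isbn_search_analyzer, it is ignored on keyword fields.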
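
The difference between the two field types can be pictured with a small plain-Java simulation. This is only a sketch of the matching behavior described in the answers, not ElasticSearch code: it assumes a keyword field compares the query text with the stored term as-is, while a text field runs the analyzer on both the indexed value and the query. All names (FieldTypeMatchDemo, analyze, keywordMatches, textMatches) are hypothetical.

```java
// Hypothetical simulation of keyword vs. text matching for the ISBN example.
class FieldTypeMatchDemo {

    // Stand-in for isbn_search_analyzer: strip everything that is not a letter or digit.
    static String analyze(String value) {
        return value.replaceAll("[^a-zA-Z0-9]", "");
    }

    // keyword field: the analyzer is ignored, so the terms must be identical.
    static boolean keywordMatches(String indexed, String query) {
        return indexed.equals(query);
    }

    // text field with the analyzer: both sides are normalized before comparison.
    static boolean textMatches(String indexed, String query) {
        return analyze(indexed).equals(analyze(query));
    }

    public static void main(String[] args) {
        String indexed = "1113333444444";
        String query = "111-3333-444444";

        System.out.println(keywordMatches(indexed, query)); // prints false
        System.out.println(textMatches(indexed, query));    // prints true
    }
}
```

Under these assumptions, the hyphenated query can only match once the field is a text field that uses the analyzer.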

Here is a working example with your sample data, with the fields defined as text.

Index mapping

{
    "settings": {
        "analysis": {
            "filter": {
                "clean_special": {
                    "type": "pattern_replace",
                    "pattern": "[^a-zA-Z0-9]",
                    "replacement": ""
                }
            },
            "analyzer": {
                "isbn_search_analyzer": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": [
                        "clean_special"
                    ]
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "isbn": {
                "type": "text",
                "analyzer": "isbn_search_analyzer"
            },
            "ean": {
                "type": "text",
                "analyzer": "isbn_search_analyzer"
            }
        }
    }
}

Index sample records

{
    "isbn" : "111-3333-444444"
}

{
    "isbn" : "111-3333-2222"
}

Search query

{
    "query": {
        "query_string": {
            "fields": [
                "isbn",
                "ean"
            ],
            "query": "111-3333-444444"
        }
    }
}

And the search response

"hits": [
            {
                "_index": "65780647",
                "_type": "_doc",
                "_id": "1",
                "_score": 0.6931471,
                "_source": {
                    "isbn": "111-3333-444444"
                }
            }
        ]
