edge_ngram filter and not_analyzed to match search

I have the following Elasticsearch configuration:

PUT /my_index
{
    "settings": {
        "number_of_shards": 1, 
        "analysis": {
            "filter": {
                "autocomplete_filter": { 
                    "type":     "edge_ngram",
                    "min_gram": 1,
                    "max_gram": 20
                },
                "snow_filter" : {
                    "type" : "snowball",
                    "language" : "English"
                }
            },
            "analyzer": {
                "autocomplete": {
                    "type":      "custom",
                    "tokenizer": "standard",
                    "filter": [
                        "lowercase",
                        "snow_filter",
                        "autocomplete_filter" 
                    ]
                }
            }
        }
    }
}

PUT /my_index/_mapping/my_type
{
    "my_type": {
        "properties": {
            "name": {
                "type": "multi_field",
                "fields": {
                    "name": {
                        "type":            "string",
                        "index_analyzer":  "autocomplete", 
                        "search_analyzer": "snowball"
                    },
                    "not": {
                        "type": "string",
                        "index": "not_analyzed"
                    }
                }
            }
        }
    }
}


POST /my_index/my_type/_bulk
{ "index": { "_id": 1            }}
{ "name": "Brown foxes"    }
{ "index": { "_id": 2            }}
{ "name": "Yellow furballs" }
{ "index": { "_id": 3            }}
{ "name": "my discovery" }
{ "index": { "_id": 4            }}
{ "name": "myself is fun" }
{ "index": { "_id": 5            }}
{ "name": ["foxy", "foo"]    }
{ "index": { "_id": 6            }}
{ "name": ["foo bar", "baz"] }

I'm trying to search by name for "foo bar" and get back only item 6, but I'm not quite sure how to write the search. This is what I'm doing right now:

GET /my_index/my_type/_search
{
    "query": {
        "match": {
            "name": {
                "query":    "foo b"
            }
        }
    }
}

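I know this comes down to how the tokenizer splits the words, but I'm a bit lost on how to be both flexible and strict enough to match this. I'm guessing I need a multi-field on my name mapping, but I'm not sure. How can I fix the query and/or my mapping to meet my needs?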

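You're already pretty close. Your edge_ngram analyzer generates tokens with a minimum length of 1, your query gets tokenized into "foo" and "b", and the default match query operator is "or". As a result, your query matches every document that has a term starting with "b" (or "foo"), which is three of the documents.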
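To make that concrete, here is a rough Python sketch of the matching logic. It is not Elasticsearch code: the function names (edge_ngrams, index_terms, query_terms, match) are made up for illustration, a simple regex stands in for the standard tokenizer, and Snowball stemming is left out, which I'm assuming does not change the outcome for this small data set.

```python
import re

def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes of `token` between min_gram and max_gram characters."""
    return {token[:n] for n in range(min_gram, min(len(token), max_gram) + 1)}

def index_terms(values):
    """Approximate the index-time "autocomplete" analyzer (stemming omitted)."""
    if isinstance(values, str):
        values = [values]
    terms = set()
    # All values of a multi-valued field end up in the same pool of terms.
    for value in values:
        for token in re.findall(r"\w+", value.lower()):
            terms |= edge_ngrams(token)
    return terms

def query_terms(text):
    """Approximate the search-time analyzer: lowercased whole words, no n-grams."""
    return re.findall(r"\w+", text.lower())

def match(docs, text, operator="or"):
    """Return ids of docs whose indexed terms satisfy the query under `operator`."""
    combine = all if operator == "and" else any
    wanted = query_terms(text)
    return [doc_id for doc_id, values in docs.items()
            if combine(term in index_terms(values) for term in wanted)]

docs = {
    1: "Brown foxes",
    2: "Yellow furballs",
    3: "my discovery",
    4: "myself is fun",
    5: ["foxy", "foo"],
    6: ["foo bar", "baz"],
}

print(match(docs, "foo b"))         # [1, 5, 6]
print(match(docs, "foo b", "and"))  # [6]
```

In this simplified model, document 1 matches on "b" alone (from "brown"), document 5 on "foo" alone, and only document 6 has both terms.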

Using the "and" operator seems to give you what you want:

POST /my_index/my_type/_search
{
    "query": {
        "match": {
            "name": {
                "query":    "foo b",
                "operator": "and"
            }
        }
    }
}
...
{
   "took": 1,
   "timed_out": false,
   "_shards": {
      "total": 1,
      "successful": 1,
      "failed": 0
   },
   "hits": {
      "total": 1,
      "max_score": 1.4451914,
      "hits": [
         {
            "_index": "test_index",
            "_type": "my_type",
            "_id": "6",
            "_score": 1.4451914,
            "_source": {
               "name": [
                  "foo bar",
                  "baz"
               ]
            }
         }
      ]
   }
}

Here is the code I used to test it:

http://sense.qbox.io/gist/4f6fb7c1fdc6942023091ee1433d7490e04e7dea
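
As a side note on the "strict" half of the question: the mapping also defines a not_analyzed sub-field ("not"), which stores each value as a single unmodified term. The sketch below is only a Python model of that behavior, with hypothetical helper names (exact_terms, term_lookup), to show how it differs from the n-gram field. It is not a tested Elasticsearch query.

```python
def exact_terms(values):
    """Model a not_analyzed field: each value is indexed as one unmodified term."""
    if isinstance(values, str):
        values = [values]
    return set(values)

def term_lookup(docs, value):
    """Return ids of docs where some field value equals `value` exactly."""
    return [doc_id for doc_id, values in docs.items()
            if value in exact_terms(values)]

docs = {
    1: "Brown foxes",
    2: "Yellow furballs",
    3: "my discovery",
    4: "myself is fun",
    5: ["foxy", "foo"],
    6: ["foo bar", "baz"],
}

print(term_lookup(docs, "foo bar"))  # [6]
print(term_lookup(docs, "foo b"))    # []
print(term_lookup(docs, "Foo Bar"))  # []
```

So the not_analyzed field only helps when the whole value is known, with matching case. The partial input "foo b" still needs the n-gram field with the "and" operator.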