ElasticSearch Partial Mappings with Spaces

My partial mappings and queries worked fine until spaces were involved. For example, the term Jon Doe has its term vector broken down into:

"terms": {
            "j": {
               "term_freq": 1
            },
            "jo": {
               "term_freq": 1
            },
            "jon": {
               "term_freq": 1
            },
            "d": {
               "term_freq": 1
            },
            "do": {
               "term_freq": 1
            },
            "doe": {
               "term_freq": 1
            }
         }

But I want it to be:

"terms": {
            "j": {
               "term_freq": 1
            },
            "jo": {
               "term_freq": 1
            },
            "jon": {
               "term_freq": 1
            },
            "jon ": {
               "term_freq": 1
            },
            "jon d": {
               "term_freq": 1
            },
            "jon do": {
               "term_freq": 1
            },
            "jon doe": {
               "term_freq": 1
            }
         }

Here are my mapping and settings:

Mapping:

   name: {
    type: 'string',
    term_vector: 'yes',
    analyzer: 'ngram_analyzer',
    search_analyzer: 'standard',
    include_in_all: true
  }

Settings:

settings: {
    index: {
      analysis: {
        filter: {
          ngram_filter: {
            type: 'edge_ngram',
            min_gram: 1,
            max_gram: 15
          }
        },
        analyzer: {
          'ngram_analyzer': {
            filter: [
              'lowercase',
              'ngram_filter'
            ],
            type: 'custom',
            tokenizer: 'standard'
          }
        }
      },
      number_of_shards: 1,
      number_of_replicas: 1
    }
  }

How can I do this?

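You simply need to use a different tokenizer in your custom analyzer. The standard tokenizer splits "Jon Doe" into two tokens before the edge n-gram filter runs, so the space never reaches the filter. The keyword tokenizer emits the whole input as a single token, so the filter produces prefixes that span the space: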

    "analyzer": {
      "ngram_analyzer": {
        "filter": [
          "lowercase",
          "ngram_filter"
        ],
        "type": "custom",
        "tokenizer": "keyword"
      }
    }
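
To see why the tokenizer makes the difference, here is a rough simulation of the two analysis chains in plain JavaScript. It is only a sketch and does not call Elasticsearch: `standardTokenizer`, `keywordTokenizer`, `edgeNgrams`, and `analyze` are hypothetical helper names, and the standard tokenizer is approximated by a whitespace split (the real one follows Unicode text segmentation rules).

```javascript
// Simplified stand-ins for Elasticsearch analysis components.
// These are assumptions for illustration, not the real implementations.

// Approximates the `standard` tokenizer by splitting on whitespace.
function standardTokenizer(text) {
  return text.split(/\s+/).filter((t) => t.length > 0);
}

// Mimics the `keyword` tokenizer: the whole input becomes one token.
function keywordTokenizer(text) {
  return [text];
}

// Mimics an `edge_ngram` token filter: emits the prefixes of a token
// with lengths from minGram up to maxGram (or the token length).
function edgeNgrams(token, minGram, maxGram) {
  const grams = [];
  const limit = Math.min(maxGram, token.length);
  for (let n = minGram; n <= limit; n++) {
    grams.push(token.slice(0, n));
  }
  return grams;
}

// Tokenize, lowercase, then apply the edge n-gram filter,
// in the same order as the custom analyzer in the settings.
function analyze(text, tokenizer, minGram = 1, maxGram = 15) {
  return tokenizer(text)
    .map((t) => t.toLowerCase())
    .flatMap((t) => edgeNgrams(t, minGram, maxGram));
}

const withStandard = analyze("Jon Doe", standardTokenizer);
const withKeyword = analyze("Jon Doe", keywordTokenizer);

console.log(withStandard); // [ 'j', 'jo', 'jon', 'd', 'do', 'doe' ]
console.log(withKeyword);  // [ 'j', 'jo', 'jon', 'jon ', 'jon d', 'jon do', 'jon doe' ]
```

Note that with the keyword tokenizer the whole field value is one token, so `max_gram: 15` caps the indexed prefixes at 15 characters of the full name rather than 15 characters per word.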