How to search Elasticsearch case-insensitively

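I'm using the Elasticsearch client library for PHP. I want to create an index that stores a person's id and name, and that lets users search the names in a very flexible way (case-insensitively, by partial name, and so on).

Here is a snippet of what I have so far, annotated with comments for convenience: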

<?php

require_once(__DIR__ . '/../init.php');

$client = new Elasticsearch\Client();
$params = [
    'index' => 'person',
    'body' => [
        'settings' => [
            // Simple settings for now, single shard
            'number_of_shards' => 1,
            'number_of_replicas' => 0,
            'analysis' => [
                'filter' => [
                    'shingle' => [
                        'type' => 'shingle'
                    ]
                ],
                'analyzer' => [
                    'my_ngram_analyzer' => [
                        'tokenizer' => 'my_ngram_tokenizer',
                    ]
                ],
                // Allow searching for partial names with nGram
                'tokenizer' => [
                    'my_ngram_tokenizer' => [
                        'type' => 'nGram',
                        'min_gram' => 1,
                        'max_gram' => 15,
                        'token_chars' => ['letter', 'digit']
                    ]
                ]
            ]
        ],
        'mappings' => [
            '_default_' => [
                'properties' => [
                    'person_id' => [
                        'type' => 'string',
                        'index' => 'not_analyzed',
                    ],
                    // The name of the person
                    'value' => [
                        'type' => 'string',
                        'analyzer' => 'my_ngram_analyzer',
                        'term_vector' => 'yes',
                        'copy_to' => 'combined'
                    ],
                ]
            ],
        ]
    ]
];

// Create index `person` with ngram indexing
$client->indices()->create($params);

// Index a single person using this indexing scheme
$params = array();
$params['body']  = array('person_id' => '1234', 'value' => 'Johnny Appleseed');
$params['index'] = 'person';
$params['type']  = 'type';
$params['id']    = 'id';
$ret = $client->index($params);

// Get that document (to prove it's in there)
$getParams = array();
$getParams['index'] = 'person';
$getParams['type']  = 'type';
$getParams['id']    = 'id';
$retDoc = $client->get($getParams);
print_r($retDoc); // success


// Search for that document
$searchParams = array();
$searchParams['index'] = 'person';
$searchParams['type']  = 'type';
$searchParams['body']['query']['match']['value'] = 'J';
$queryResponse = $client->search($searchParams);
print_r($queryResponse); // FAILURE

// blow away index so that we can run the script again immediately
$deleteParams = array();
$deleteParams['index'] = 'person';
$retDelete = $client->indices()->delete($deleteParams);

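I have had this search working at times, but I've been fiddling with the script to get case-insensitive matching to behave as expected, and in the process it has stopped finding anyone when J or j is used as the query value.

Any idea what might be going on here?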

To fix the case-insensitivity part, I added

'filter' => 'lowercase',

to my ngram analyzer.
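For reference, here is a sketch of how the analysis settings might look with that filter in place. It reuses the my_ngram_analyzer and my_ngram_tokenizer names from the question; the explicit 'type' => 'custom' entry and the array form of 'filter' are assumptions added for clarity, not something the original script contained. Since a match query runs the query text through the field's analyzer by default, both J and j should be reduced to j before matching.

```php
<?php
// Sketch only: the 'analysis' part of the index settings with the lowercase
// token filter added to the ngram analyzer.
$analysis = [
    'analyzer' => [
        'my_ngram_analyzer' => [
            'type'      => 'custom',              // assumption: stated explicitly
            'tokenizer' => 'my_ngram_tokenizer',
            'filter'    => ['lowercase'],         // lowercases every ngram token
        ],
    ],
    // Allow searching for partial names with nGram (unchanged from the question)
    'tokenizer' => [
        'my_ngram_tokenizer' => [
            'type'        => 'nGram',
            'min_gram'    => 1,
            'max_gram'    => 15,
            'token_chars' => ['letter', 'digit'],
        ],
    ],
];
```

This array would take the place of the 'analysis' entry under 'settings' when the index is created.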

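Also, the reason it wasn't working to begin with is that, when using the PHP client library, you can't create the index and then search it in the same script. My guess is that something asynchronous is going on here. So create the index in one script and search it in another, and it should work.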
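A likely explanation, offered as a sketch rather than a confirmed diagnosis: Elasticsearch search is near-real-time, so a freshly indexed document can be fetched by id right away but only shows up in search results after the index has been refreshed (by default roughly once per second). If that is the cause here, forcing a refresh between indexing and searching should let everything run in a single script. The example below assumes the person index from the question already exists, and that the client is loaded through a Composer autoloader at vendor/autoload.php (a hypothetical path; the question uses its own init.php).

```php
<?php
// Assumption: the client library is installed via Composer.
require_once __DIR__ . '/vendor/autoload.php';

// Same client construction as in the question.
$client = new Elasticsearch\Client();

// Index a single person, as in the question.
$client->index([
    'index' => 'person',
    'type'  => 'type',
    'id'    => 'id',
    'body'  => ['person_id' => '1234', 'value' => 'Johnny Appleseed'],
]);

// Force a refresh so the document just indexed becomes visible to search.
$client->indices()->refresh(['index' => 'person']);

// Search with a lowercase partial name.
$queryResponse = $client->search([
    'index' => 'person',
    'type'  => 'type',
    'body'  => ['query' => ['match' => ['value' => 'j']]],
]);
print_r($queryResponse['hits']['hits']);
```

Refreshing after every write is expensive on a busy index, so this is best kept to scripts and tests like this one.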