Elastic Search Suggestions Return Zero Results

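I am trying to set up ElasticSearch using the elasticsearch_dsl Python library. I have been able to set up the index, and I can search using the .filter() method, but I cannot get the .suggest method to work.

I am trying to use the completion mapping type and the suggest query method, since this will be used for an autocomplete field (as recommended in Elastic's docs).

I am new to Elastic, so I am guessing I am missing something. Any guidance would be greatly appreciated!

What I have done so far

I did not find a tutorial that does exactly what I need, but I read through the documentation on ElasticSearch.com and for elasticsearch_dsl, and looked at some examples here and here.

PS: I am using Searchbox Elasticsearch on Heroku.

Index/mapping setup: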

# imports [...]

edge_ngram_analyzer = analyzer(
    'edge_ngram_analyzer',
    type='custom',
    tokenizer='standard',
    filter=[
        'lowercase',
        token_filter(
            'edge_ngram_filter', type='edgeNGram',
            min_gram=1, max_gram=20
        )
    ]
)

class DocumentIndex(ElasticDocument):
    title = Text()
    title_suggest = Completion(
        analyzer=edge_ngram_analyzer,
        )
    class Index:
        name = 'documents-index'

# [...] Initialize index
# [...] Upload Documents (5,000 documents)
# DocumentIndex.init()
# [DocumentIndex(**doc).save() for doc in mydocs]

Mapping output:

This is the mapping as shown in the web console:

 {
  "documents-index": {
    "mappings": {
      "doc": {
        "properties": {
          "title": {
            "type": "text"
          },
          "title_suggest": {
            "type": "completion",
            "analyzer": "edge_ngram_analyzer",
            "search_analyzer": "standard",
            "preserve_separators": true,
            "preserve_position_increments": true,
            "max_input_length": 50
          }
        }
      }
    }
  }
}

Attempting to search

Verify that the index exists:

>>> search = Search(index='documents-index')
>>> search.count()  # Returns correct amount of documents
5000
>>> [doc for doc in search.scan()][:3]
[<Hit(documents-index/doc/1): ...} ...

Test search (works):

>>> query = search.filter('match', title='class')
>>> result = query.execute()
>>> result.hits
<Response: [<Hit(documents-in [ ... ]
>>> len(result.hits)
10
>>> query.to_dict()  # see query payload
{ 
  "query":{
    "bool":{
      "filter":[
        {
          "match":{
            "title":"class"
          }
        }
      ]
    }
  }
}

The part that fails

I cannot get any of the .suggest() methods to work. Note: I am following the official library docs.

Test suggest:

>>> query = search.suggest(
        'title-suggestions',
        'class',
        completion={
        'field': 'title_suggest',
        'fuzzy': True
        })
>>> query.execute()
<Response: {}>
>>> query.to_dict() # see query payload
{
  "suggest": {
    "title-suggestions": {
      "text": "class",
      "completion": { "field": "title_suggest" }
    }
  }
}

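I also tried the code below, and of course many other query types and values, but the results were similar. (Note that with .filter() I always get the expected results.)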

>>> query = search.suggest(
        'title-suggestions',
        'class',
         term=dict(field='title'))
>>> query.to_dict() # see query payload
{
  "suggest": {
    "title-suggestions": { 
        "text": "class", 
        "term": { 
            "field": "title" 
        } 
    }
  }
}
>>> query.execute()
<Response: {}>

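Update

Following Honza's suggestion, I updated the title_suggest mapping to a plain Completion field with no custom analyzer. I also deleted the index and re-indexed from scratch.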

class DocumentIndex(ElasticDocument):
    title = Text()
    title_suggest = Completion()
    class Index:
        name = 'documents-index'

Unfortunately, the problem persists. Here are some more tests:

Verify that title_suggest is indexed correctly

>>> search = Search(index='documents-index')
>>> search.index('documents-index').count()
23369
>>> [d for d in search.scan()][0].title
'AnalyticalGrid Property'
>>> [d for d in search.scan()][0].title_suggest
'AnalyticalGrid Property'

Trying the search again:

>>> len(search.filter('term', title='class').execute().hits)
10
>>> search.filter('term', title_suggest='Class').execute().hits
[]
>>> search.suggest('suggestions', 'class', completion={'field': 'title_suggest'}).execute().hits
[]

Verify the mapping:

>>> pprint(index.get_mapping())
{
  "documents-index": {
    "mappings": {
      "doc": {
        "properties": {
          "title": { "type": "text" },
          "title_suggest": {
            "analyzer": "simple",
            "max_input_length": 50,
            "preserve_position_increments": True,
            "preserve_separators": True,
            "type": "completion"
          }
        }
      }
    }
  }
}

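For a completion field you do not want to use an ngram analyzer. The completion field automatically indexes all prefixes and is optimized for prefix queries, so you are doing the work twice and confusing the system. Start with an empty completion field and go from there.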
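To illustrate why the edge n-gram filter duplicates work here, the following plain-Python sketch mimics what such a filter emits for each token. It is only an approximation under stated assumptions: the helper names `edge_ngrams` and `analyze` are hypothetical, whitespace splitting stands in for the standard tokenizer, and none of this is how Elasticsearch implements the analyzer or the completion field internally.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading substrings of token with lengths min_gram..max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze(text, min_gram=1, max_gram=20):
    """Lowercase the text, split on whitespace (a simplification of the
    standard tokenizer), and expand every token into its edge n-grams."""
    terms = []
    for token in text.lower().split():
        terms.extend(edge_ngrams(token, min_gram, max_gram))
    return terms


if __name__ == "__main__":
    # Every prefix of the token becomes its own term.
    print(edge_ngrams("class"))
    print(analyze("AnalyticalGrid Property"))
```

Each token is expanded into every one of its prefixes before it reaches the field. Since the completion field already handles prefix lookups on its own, feeding it pre-expanded prefixes adds nothing and can distort the suggestions.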

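I would like to formalize the solution that Honza provided in a comment on another answer.

The problem was not the mapping, but the fact that the results of the .suggest() method are not returned under hits.

The suggestions are visible in the dictionary returned by: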

>>> response = query.execute()
>>> print(response)
<Response: {}>
>>> response.to_dict()
# output is
# {'query': {},
# 'suggest': {'title-suggestions': {'completion': {'field': 'title_suggest'},
# [...]

I also found additional details in this github issue:

HonzaKral commented 27 days ago

The Response object provides access to any and all fields that have been returned by elasticsearch. For convenience there is a shortcut that allow to iterate over the hits as that is both most common and also easy to do. For other parts of the response, like aggregations or suggestions, you need to access them explicitly like response.suggest.foo.options.
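As a minimal sketch of reading suggestions explicitly, the helper below walks the dictionary form of a response (as returned by `response.to_dict()`) and collects the option texts for one named suggester. The helper name `extract_suggestions` and the `sample_response` data are hypothetical and hand-written for illustration; the shape assumed is the usual suggest response layout, where each suggester name maps to a list of entries that each carry an `options` list.

```python
def extract_suggestions(response_dict, name):
    """Collect the option texts of the named suggester from a response dict."""
    texts = []
    # Each suggester name maps to a list of entries (one per input text),
    # and each entry holds the list of matching options.
    for entry in response_dict.get("suggest", {}).get(name, []):
        for option in entry.get("options", []):
            texts.append(option["text"])
    return texts


# Hypothetical, hand-written response used only to exercise the helper.
# Note that "hits" is empty even though suggestions are present.
sample_response = {
    "hits": {"total": 0, "hits": []},
    "suggest": {
        "title-suggestions": [
            {
                "text": "class",
                "offset": 0,
                "length": 5,
                "options": [
                    {"text": "Class Property", "_score": 1.0},
                    {"text": "ClassGrid Method", "_score": 1.0},
                ],
            }
        ]
    },
}

if __name__ == "__main__":
    print(extract_suggestions(sample_response, "title-suggestions"))
```

With a live query this would presumably be called as `extract_suggestions(query.execute().to_dict(), 'title-suggestions')`. Because the suggester name in the question contains a hyphen, attribute access such as `response.suggest.foo` is not possible for it; key-style access on `response.suggest` should work instead, though that is an assumption worth checking against the installed library version.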