Word and phrase search on multiple fields in ElasticSearch

I want to search documents in ElasticSearch from Python. I am looking for documents that contain a word and/or a phrase in any one of three fields.

GET /my_docs/_search
{
  "query": {
    "multi_match": {
      "query": "Ford \"lone star\"",
      "fields": [
        "title",
        "description",
        "news_content"
      ],
      "minimum_should_match": "-1",
      "operator": "AND"
    }
  }
}

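With the query above, I want to get the documents whose title, description, or news_content contains both "Ford" and "lone star" (as a phrase).

However, it does not seem to treat "lone star" as a phrase. It returns documents that contain "Ford", "lone", and "star" as separate words.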

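I was able to reproduce your issue and solved it using Elasticsearch's REST API. I am not familiar with the Python syntax, so I am glad you provided the query in JSON format; I built my solution on top of it.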
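
A rough way to see why the quotes have no effect: multi_match does not parse any query syntax, it simply runs the query text through the analyzer of each field. The Python sketch below imitates that step. It assumes the default standard analyzer can be approximated by lowercasing and splitting on non-word characters, which is a simplification and not Elasticsearch's actual tokenizer; rough_analyze is a hypothetical helper name.

```python
import re

def rough_analyze(text):
    """Rough stand-in for the standard analyzer (assumption):
    lowercase the text and split it on non-word characters."""
    return re.findall(r"\w+", text.lower())

# The double quotes are discarded, so only three separate terms remain.
tokens = rough_analyze('Ford "lone star"')
print(tokens)  # ['ford', 'lone', 'star']
```

Once the query is reduced to three independent terms, nothing is left to tell Elasticsearch that "lone" and "star" have to be adjacent.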

Index definition

{
    "mappings": {
        "properties": {
            "title": {
                "type": "text"
            },
            "description" :{
                "type" : "text"
            },
            "news_content" : {
                "type" : "text"
            }
        }
    }
}

Sample documents (the first one matches your criteria)

{
  "title" : "Ford",
  "news_content" : "lone star",
  "description" : "foo bar"
}

{
  "title" : "Ford",
  "news_content" : "lone",
  "description" : "star"
}
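
Since the question is about Python, here is a minimal sketch of how the two sample documents could be turned into a request body for the _bulk API. The index name so_phrase is taken from the search response further down; the _id values and the helper name bulk_payload are assumptions for illustration.

```python
import json

# Sample documents from this answer; the _id values are assumed.
docs = {
    "1": {"title": "Ford", "news_content": "lone star", "description": "foo bar"},
    "2": {"title": "Ford", "news_content": "lone", "description": "star"},
}

def bulk_payload(index, docs_by_id):
    """Build an NDJSON body for the _bulk API:
    one action line followed by one source line per document."""
    lines = []
    for doc_id, source in docs_by_id.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk body has to end with a newline character.
    return "\n".join(lines) + "\n"

payload = bulk_payload("so_phrase", docs)
print(payload)
```

The resulting string could then be sent as a POST to /_bulk with the Content-Type header set to application/x-ndjson.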

The search query you are looking for (both clauses under must have to match, and "type": "phrase" makes lone star match as a phrase)

{
    "query": {
        "bool": {
            "must": [
                {
                    "multi_match": {
                        "query": "ford",
                        "fields": [
                            "title",
                            "description",
                            "news_content"
                        ]
                    }
                },
                {
                    "multi_match": {
                        "query": "lone star",
                        "fields": [
                            "title",
                            "description",
                            "news_content"
                        ],
                        "type": "phrase"
                    }
                }
            ]
        }
    }
}
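
For the Python side, the same request body can be assembled as a plain dict. This is a minimal sketch, not a definitive implementation: build_query and FIELDS are hypothetical names, and the function only generalizes the query above to any number of words and phrases.

```python
import json

# The three fields from the question.
FIELDS = ["title", "description", "news_content"]

def build_query(words, phrases, fields=FIELDS):
    """Return a bool query in which every word and every phrase
    must match in at least one of the given fields."""
    must = []
    for word in words:
        must.append({"multi_match": {"query": word, "fields": list(fields)}})
    for phrase in phrases:
        # "type": "phrase" requires the terms to appear next to each other, in order.
        must.append(
            {"multi_match": {"query": phrase, "fields": list(fields), "type": "phrase"}}
        )
    return {"query": {"bool": {"must": must}}}

body = build_query(["ford"], ["lone star"])
print(json.dumps(body, indent=2))
```

With the official elasticsearch Python client, this dict would typically be handed to the client's search method together with the index name. Depending on the client version it is passed either as body=body or as query=body["query"], so check the documentation of the version you have installed.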

The result contains only one of the sample documents

"hits": [
      {
        "_index": "so_phrase",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.9527341,
        "_source": {
          "title": "Ford",
          "news_content": "lone star",
          "description": "foo bar"
        }
      }
    ]
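
If the search text reaches your Python code as one string such as Ford "lone star", it has to be split into words and phrases before the clauses are built. A minimal sketch, assuming phrases are always wrapped in balanced double quotes; split_terms is a hypothetical helper, and unbalanced or empty quotes are not handled specially.

```python
import re

def split_terms(text):
    """Split a search string into (words, phrases).
    Double-quoted runs become phrases, everything else becomes single words."""
    words, phrases = [], []
    # Each match fills exactly one of the two groups: a quoted phrase or a bare word.
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', text):
        if phrase:
            phrases.append(phrase)
        else:
            words.append(word)
    return words, phrases

print(split_terms('Ford "lone star"'))  # (['Ford'], ['lone star'])
```

Each word can then go into a plain multi_match clause and each phrase into a multi_match clause with "type": "phrase", as in the query above.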