Elasticsearch compound query

I am querying an Elasticsearch index that holds 300 records with the following compound query:

GET my_index/_search
{
  "size": 10,
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "should": [
              {
                "multi_match": {
                  "query": "card",
                  "fields": [
                    "title^1.0"
                  ]
                }
              }
            ],
            "must": {
              "term": {
                "_index": {
                  "value": "my_index"
                }
              }
            }
          }
        }
      ]
    }
  }
}

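The must clause on _index is there because this can become a multi-index query, depending on some business logic. (The must should most likely be a filter, and I can change that, but it is not part of my question. I get the same results with a filter.)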

I expected this to return only the documents that match the should clause, but I am getting back all the documents in the index (300).

Why does this happen?

Adding a working example with index data and a search query.

Index data:

{
    "title":"card",
    "cost":"55"
}
{
    "title":"Card making",
    "cost":"55"
}
{
    "title":"elasticsearch",
    "cost":"55"
}

Search query:

GET /_search
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "filter": [
              {
                "term": {
                  "_index": {
                    "value": "my_index"
                  }
                }
              }
            ],
            "must": [
              {
                "multi_match": {
                  "fields": [
                    "title^1.0"
                  ],
                  "query": "card"
                }
              }
            ]
          }
        }
      ]
    }
  }
}

Search result:

"hits": [
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "1",
        "_score": 0.7549127,
        "_source": {
          "title": "card",
          "cost": "55"
        }
      },
      {
        "_index": "my_index",
        "_type": "_doc",
        "_id": "2",
        "_score": 0.55654144,
        "_source": {
          "title": "Card making",
          "cost": "55"
        }
      }
    ]

The fix for this is to add the minimum_should_match parameter to the inner bool query. The resulting query becomes:

GET my_index/_search
{
  "size": 10,
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "minimum_should_match": 1,
            "should": [
              {
                "multi_match": {
                  "query": "card",
                  "fields": [
                    "title^1.0"
                  ]
                }
              }
            ],
            "must": {
              "term": {
                "_index": {
                  "value": "my_index"
                }
              }
            }
          }
        }
      ]
    }
  }
}

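I think the reason is the default value of minimum_should_match. The bool query takes a more-matches-is-better approach to scoring, and once a bool query has a must or filter clause, its should clauses become optional: they still run and raise the _score of the documents they match, but a document that satisfies the must/filter clause is returned even if no should clause matches. Here every document in my_index satisfies the term clause on _index, so all 300 come back. Adding "minimum_should_match": 1 instructs Elasticsearch to require at least one matching should clause before returning a document.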

Excerpt from the Elasticsearch documentation:

The bool query takes a more-matches-is-better approach, so the score from each matching must or should clause will be added together to provide the final _score for each document.

You can use the minimum_should_match parameter to specify the number or percentage of should clauses returned documents must match.

If the bool query includes at least one should clause and no must or filter clauses, the default value is 1. Otherwise, the default value is 0.

For other valid values, see the minimum_should_match parameter.

Link for reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html#bool-min-should-match
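
To make the default rule quoted above concrete, here is a small Python sketch that models it. This is a toy model, not Elasticsearch code: the function names effective_minimum_should_match and bool_matches are hypothetical, only integer values of minimum_should_match are handled, and the clause results are passed in by hand rather than evaluated against an index.

```python
def effective_minimum_should_match(bool_query: dict) -> int:
    """Toy model of the documented default for minimum_should_match.

    Assumption: only integer values are supported here; the real
    parameter also accepts percentages and combinations.
    """
    if "minimum_should_match" in bool_query:
        return int(bool_query["minimum_should_match"])
    has_should = bool(bool_query.get("should"))
    has_must_or_filter = bool(bool_query.get("must")) or bool(bool_query.get("filter"))
    # Default is 1 only when there is a should clause and no must/filter clause.
    return 1 if has_should and not has_must_or_filter else 0


def bool_matches(must_ok: bool, should_hits: int, min_should: int) -> bool:
    """A document matches if must/filter pass and enough should clauses hit."""
    return must_ok and should_hits >= min_should


multi_match = {"multi_match": {"query": "card", "fields": ["title^1.0"]}}
index_term = {"term": {"_index": {"value": "my_index"}}}

# Inner bool from the question: should + must, no explicit minimum_should_match.
inner_original = {"should": [multi_match], "must": index_term}
# Inner bool from the fix: same clauses plus minimum_should_match = 1.
inner_fixed = {"minimum_should_match": 1, "should": [multi_match], "must": index_term}
# Outer bool: a single should clause and nothing else.
outer = {"should": [inner_original]}

original_min = effective_minimum_should_match(inner_original)
fixed_min = effective_minimum_should_match(inner_fixed)
outer_min = effective_minimum_should_match(outer)

# A document titled "elasticsearch" in my_index: the term clause on _index
# passes, the multi_match on "card" does not (0 should clauses hit).
returned_by_original = bool_matches(True, 0, original_min)
returned_by_fixed = bool_matches(True, 0, fixed_min)

print(original_min, fixed_min, outer_min)
print(returned_by_original, returned_by_fixed)
```

Under this model the inner bool query from the question gets an effective minimum_should_match of 0, so the document titled "elasticsearch" is returned even though it does not match "card". With the explicit value of 1 it is excluded. The outer bool query has only a should clause, so its default is 1, which is why the inner bool query still has to match.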