Combine range and match in Elasticsearch

I have documents with the following structure in an Elasticsearch index:

{
  "title": "Nutrtional facts",
  "begin_timestamp" : 1582686052,
  "end_timestamp" : 1582686093
}

{
  "title": "Guitar facts",
  "begin_timestamp" : 1447991100,
  "end_timestamp" : 1447994100
}

{
  "title": "Hair style facts",
  "begin_timestamp" : 1447991100,
  "end_timestamp" : 1447994100
}

{
  "title": "Piano facts",
  "begin_timestamp" : 1554416211,
  "end_timestamp" : 1591308724
}

My goal is to retrieve the documents whose title matches facts and whose begin or end timestamp is greater than the current date and time.

title matches `facts` AND (begin_timestamp > CURRENT_DATE_TIME OR end_timestamp > CURRENT_DATE_TIME)

The query I am currently running is:

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "facts"
          }
        }
      ],
      "should": [
        {
          "range": {
            "begin_timestamp_for_search": {
              "gte": 1580853917
            }
          }
        },
        {
          "range": {
            "begin_timestamp_for_search": {
              "gte": 1580853917
            }
          }
        }
      ]
    }
  }
}

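However, this matches anything that matches facts and returns all documents, regardless of whether their timestamps are before or after the current date and time. I am new to ES and would like to know how to write the query so that the only documents returned are: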

{
  "title": "Nutrtional facts",
  "begin_timestamp" : 1582686052,
  "end_timestamp" : 1582686093
}

{
  "title": "Piano facts",
  "begin_timestamp" : 1554416211,
  "end_timestamp" : 1591308724
}

{
    "from": 0,
    "size": 200,
    "query": {
        "bool": {
            "filter": [
                {
                    "bool": {
                        "must": [
                            {
                                "bool": {
                                    "must": [
                                        {
                                            "wildcard": {
                                                "title": {
                                                    "wildcard": "*facts*",
                                                    "boost": 1
                                                }
                                            }
                                        },
                                        {
                                            "bool": {
                                                "should": [
                                                    {
                                                        "range": {
                                                            "begin_timestamp": {
                                                                "from": 1580853917,
                                                                "to": null,
                                                                "include_lower": false,
                                                                "include_upper": true,
                                                                "boost": 1
                                                            }
                                                        }
                                                    },
                                                    {
                                                        "range": {
                                                            "end_timestamp": {
                                                                "from": 1580853917,
                                                                "to": null,
                                                                "include_lower": false,
                                                                "include_upper": true,
                                                                "boost": 1
                                                            }
                                                        }
                                                    }
                                                ],
                                                "adjust_pure_negative": true,
                                                "boost": 1
                                            }
                                        }
                                    ],
                                    "adjust_pure_negative": true,
                                    "boost": 1
                                }
                            }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1
                    }
                }
            ],
            "adjust_pure_negative": true,
            "boost": 1
        }
    }
}

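  1. Your query uses some wrong field names: for example, according to your documents, begin_timestamp_for_search should be begin_timestamp.
  2. You need to set the minimum_should_match option to 1 to require that at least one of the should clauses matches (see the Elastic documentation).

So your query should look like this: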

{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "facts"
          }
        }
      ],
      "should": [
        {
          "range": {
            "begin_timestamp": {
              "gte": 1580853917
            }
          }
        },
        {
          "range": {
            "end_timestamp": {
              "gte": 1580853917
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
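
To illustrate why the original query returned every document, here is a rough Python sketch of the matching rule. It is a simplified model and not how Elasticsearch is implemented: scoring is ignored, the match clause is approximated by a lowercase whitespace split, and the helper name bool_matches is made up for this example. It assumes the documented default that minimum_should_match is 0 when a bool query also contains a must clause.

```python
# Simplified model of whether a document matches a bool query.
# Assumption: scoring and text analysis are ignored entirely.

DOCS = [
    {"title": "Nutrtional facts", "begin_timestamp": 1582686052, "end_timestamp": 1582686093},
    {"title": "Guitar facts", "begin_timestamp": 1447991100, "end_timestamp": 1447994100},
    {"title": "Hair style facts", "begin_timestamp": 1447991100, "end_timestamp": 1447994100},
    {"title": "Piano facts", "begin_timestamp": 1554416211, "end_timestamp": 1591308724},
]

NOW = 1580853917  # the threshold used in the queries above


def bool_matches(doc, must, should, minimum_should_match):
    """Hypothetical helper: True if doc satisfies every must clause and
    at least minimum_should_match of the should clauses."""
    if not all(clause(doc) for clause in must):
        return False
    satisfied = sum(1 for clause in should if clause(doc))
    return satisfied >= minimum_should_match


# Rough stand-in for {"match": {"title": "facts"}}.
must = [lambda d: "facts" in d["title"].lower().split()]
should = [
    lambda d: d["begin_timestamp"] >= NOW,
    lambda d: d["end_timestamp"] >= NOW,
]

# With a must clause present the default is 0, so should clauses
# do not exclude any document.
default_hits = [d["title"] for d in DOCS if bool_matches(d, must, should, 0)]

# Requiring one should clause drops the documents that lie in the past.
strict_hits = [d["title"] for d in DOCS if bool_matches(d, must, should, 1)]

print(default_hits)
print(strict_hits)
```

In this model the first list contains all four titles, while the second contains only the two documents whose begin or end timestamp is at or after the threshold.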

Your snippet contains a few small typos, and you need to wrap your should clauses in another bool query, which you then add to your must clause.

Solution (verified with ES 7.5.x)

POST my_index/_bulk
{"index": {}}
{"title": "Nutrtional facts", "begin_timestamp": 1582686052, "end_timestamp": 1582686093}
{"index": {}}
{"title": "Guitar facts", "begin_timestamp": 1447991100, "end_timestamp": 1447994100}
{"index": {}}
{"title": "Hair style facts", "begin_timestamp": 1447991100, "end_timestamp": 1447994100}
{"index": {}}
{"title": "Piano facts", "begin_timestamp": 1554416211, "end_timestamp": 1591308724}


GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"title": "facts"}},
        {"bool": {
          "should": [
            {"range": {"begin_timestamp": {"gte": 1580853917}}},
            {"range": {"end_timestamp": {"gte": 1580853917}}}
          ]
        }}
      ]
    }
  }
}
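
Since the threshold is meant to be the current date and time, the request body would normally be built on the client side. A minimal Python sketch of that, assuming the timestamps are epoch seconds as in the sample documents; the function name build_query is hypothetical and no Elasticsearch client is involved, only plain dicts:

```python
import json
import time


def build_query(term, now=None):
    """Hypothetical helper: build the nested bool query shown above.

    `now` is a Unix timestamp in seconds; it defaults to the current time.
    """
    if now is None:
        now = int(time.time())
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"title": term}},
                    # Inner bool with only should clauses: at least one
                    # of the two ranges has to match.
                    {"bool": {"should": [
                        {"range": {"begin_timestamp": {"gte": now}}},
                        {"range": {"end_timestamp": {"gte": now}}},
                    ]}},
                ]
            }
        }
    }


# Fixed timestamp so the output is reproducible.
body = build_query("facts", now=1580853917)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET my_index/_search.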

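Note 1: The solution snippet above fixes the typos from your snippet:

  • using different field names in the documents and in the query
  • querying the same begin_timestamp field twice instead of covering end_timestamp as well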

Note 2: "minimum_should_match": 1 is not required here, because it is the default behaviour for a bool query that consists only of should clauses.

Tip: It is better to model your timestamps as fields of type date. This allows you to use date math (e.g. now) in your queries. Internally, Elasticsearch stores dates as epoch_in_millis.
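
As a sketch of what that tip could look like, the two request bodies below are written as Python dicts. They assume the timestamps stay in epoch seconds, hence the epoch_second format on the date fields; the mapping itself is an assumption, since the original index mapping is not shown. The first body would be sent when creating the index (PUT my_index), the second with GET my_index/_search.

```python
import json

# Assumed mapping: both timestamps become date fields parsed from epoch seconds.
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "begin_timestamp": {"type": "date", "format": "epoch_second"},
            "end_timestamp": {"type": "date", "format": "epoch_second"},
        }
    }
}

# Same nested bool query as above, but with date math instead of a
# hard-coded number; "now" is resolved by Elasticsearch at query time.
query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"title": "facts"}},
                {"bool": {"should": [
                    {"range": {"begin_timestamp": {"gt": "now"}}},
                    {"range": {"end_timestamp": {"gt": "now"}}},
                ]}},
            ]
        }
    }
}

print(json.dumps(mapping, indent=2))
print(json.dumps(query, indent=2))
```

With this approach the client no longer has to compute the current timestamp itself.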