Combine range and match in Elasticsearch
I have documents with the following structure in an Elasticsearch index:
{
  "title": "Nutrtional facts",
  "begin_timestamp": 1582686052,
  "end_timestamp": 1582686093
}
{
  "title": "Guitar facts",
  "begin_timestamp": 1447991100,
  "end_timestamp": 1447994100
}
{
  "title": "Hair style facts",
  "begin_timestamp": 1447991100,
  "end_timestamp": 1447994100
}
{
  "title": "Piano facts",
  "begin_timestamp": 1554416211,
  "end_timestamp": 1591308724
}
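My goal is to retrieve the documents whose title matches `facts` and whose begin or end timestamp is greater than the current date and time:
title matches `facts` && (begin_timestamp > CURRENT_DATE_TIME OR end_timestamp > CURRENT_DATE_TIME)
This is the query I currently run: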
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "facts"
          }
        }
      ],
      "should": [
        {
          "range": {
            "begin_timestamp_for_search": {
              "gte": 1580853917
            }
          }
        },
        {
          "range": {
            "begin_timestamp_for_search": {
              "gte": 1580853917
            }
          }
        }
      ]
    }
  }
}
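However, this matches anything that matches `facts` and returns all documents, regardless of whether their timestamps are before or after the current date and time. I am new to ES and would like to know how to write the query so that the only documents returned are: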
{
  "title": "Nutrtional facts",
  "begin_timestamp": 1582686052,
  "end_timestamp": 1582686093
}
{
  "title": "Piano facts",
  "begin_timestamp": 1570227141,
  "end_timestamp": 1591308724
}
{
  "from": 0,
  "size": 200,
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "must": [
                    {
                      "wildcard": {
                        "title": {
                          "wildcard": "*facts*",
                          "boost": 1
                        }
                      }
                    },
                    {
                      "bool": {
                        "should": [
                          {
                            "range": {
                              "begin_timestamp": {
                                "from": 1580853917,
                                "to": null,
                                "include_lower": false,
                                "include_upper": true,
                                "boost": 1
                              }
                            }
                          },
                          {
                            "range": {
                              "end_timestamp": {
                                "from": 1580853917,
                                "to": null,
                                "include_lower": false,
                                "include_upper": true,
                                "boost": 1
                              }
                            }
                          }
                        ],
                        "adjust_pure_negative": true,
                        "boost": 1
                      }
                    }
                  ],
                  "adjust_pure_negative": true,
                  "boost": 1
                }
              }
            ],
            "adjust_pure_negative": true,
            "boost": 1
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}
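- Your query uses some wrong field names: for example, according to your documents, `begin_timestamp_for_search` should be `begin_timestamp`.
- You need to set the `minimum_should_match` option to 1 to require that at least one of the `should` conditions matches (link to Elastic documentation).
So your query should look like this: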
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": "facts"
          }
        }
      ],
      "should": [
        {
          "range": {
            "begin_timestamp": {
              "gte": 1580853917
            }
          }
        },
        {
          "range": {
            "end_timestamp": {
              "gte": 1580853917
            }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
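The threshold `1580853917` in the query above is a hard-coded epoch-seconds value standing in for "now". A minimal sketch, assuming the request body is assembled client-side in Python, of filling in the current time instead. The function name `build_query` is hypothetical; the result is a plain dict that would be serialized as the JSON body of the search request.

```python
import json
import time


def build_query(now_ts=None):
    """Build the search body; now_ts is epoch seconds (defaults to the current time)."""
    if now_ts is None:
        now_ts = int(time.time())
    return {
        "query": {
            "bool": {
                "must": [{"match": {"title": "facts"}}],
                "should": [
                    {"range": {"begin_timestamp": {"gte": now_ts}}},
                    {"range": {"end_timestamp": {"gte": now_ts}}},
                ],
                # Without this, the should clauses are optional next to must.
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_query(), indent=2))
```

Passing an explicit value, for example `build_query(1580853917)`, reproduces the query shown above.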
Your snippet has a few small typos, and you need to wrap your `should` clauses in another `bool` query, which you then add to your `must` clause.
Solution (verified with ES 7.5.x)
POST my_index/_bulk
{"index": {}}
{"title": "Nutrtional facts", "begin_timestamp": 1582686052, "end_timestamp": 1582686093}
{"index": {}}
{"title": "Guitar facts", "begin_timestamp": 1447991100, "end_timestamp": 1447994100}
{"index": {}}
{"title": "Hair style facts", "begin_timestamp": 1447991100, "end_timestamp": 1447994100}
{"index": {}}
{"title": "Piano facts", "begin_timestamp": 1554416211, "end_timestamp": 1591308724}
GET my_index/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"title": "facts"}},
        {"bool": {
          "should": [
            {"range": {"begin_timestamp": {"gte": 1580853917}}},
            {"range": {"end_timestamp": {"gte": 1580853917}}}
          ]
        }}
      ]
    }
  }
}
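Remark 1: The solution snippet above fixes the typos in your snippet:
- different field names were used in the documents and in the query
- the same `begin_timestamp` field was queried twice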
Remark 2: `"minimum_should_match": 1` is not needed here, because it is the default behaviour for a `bool` query that consists only of `should` clauses.
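To make the difference concrete, here is a toy in-memory model of these `bool` semantics. It is only a sketch, not how Elasticsearch evaluates queries: it assumes a whitespace-tokenized, case-insensitive `match`, supports only `gte` ranges and integer `minimum_should_match` values, and all names (`matches`, `titles`, `DOCS`) are made up for illustration.

```python
DOCS = [
    {"title": "Nutrtional facts", "begin_timestamp": 1582686052, "end_timestamp": 1582686093},
    {"title": "Guitar facts", "begin_timestamp": 1447991100, "end_timestamp": 1447994100},
    {"title": "Hair style facts", "begin_timestamp": 1447991100, "end_timestamp": 1447994100},
    {"title": "Piano facts", "begin_timestamp": 1554416211, "end_timestamp": 1591308724},
]


def matches(doc, query):
    """Return True if doc satisfies query (toy subset: match, range/gte, bool)."""
    kind, body = next(iter(query.items()))
    if kind == "match":
        field, text = next(iter(body.items()))
        return text.lower() in str(doc.get(field, "")).lower().split()
    if kind == "range":
        field, cond = next(iter(body.items()))
        return field in doc and doc[field] >= cond["gte"]
    if kind == "bool":
        required = body.get("must", []) + body.get("filter", [])
        should = body.get("should", [])
        # Default is 1 only when should clauses stand alone, otherwise 0.
        default_msm = 1 if should and not required else 0
        msm = body.get("minimum_should_match", default_msm)
        hits = sum(matches(doc, q) for q in should)
        return all(matches(doc, q) for q in required) and hits >= msm
    raise ValueError(f"unsupported query type: {kind}")


def titles(query):
    """Titles of all sample documents that satisfy the query."""
    return [d["title"] for d in DOCS if matches(d, query)]


T = 1580853917
MATCH = {"match": {"title": "facts"}}
RANGES = [
    {"range": {"begin_timestamp": {"gte": T}}},
    {"range": {"end_timestamp": {"gte": T}}},
]

# should next to must: the range clauses are optional, so every title matches.
optional_should = {"bool": {"must": [MATCH], "should": RANGES}}
# Explicit minimum_should_match: at least one range clause has to match.
explicit_msm = {"bool": {"must": [MATCH], "should": RANGES, "minimum_should_match": 1}}
# Nested bool with only should clauses: the default of 1 applies.
nested = {"bool": {"must": [MATCH, {"bool": {"should": RANGES}}]}}

if __name__ == "__main__":
    print(titles(optional_should))  # all four titles
    print(titles(explicit_msm))     # ['Nutrtional facts', 'Piano facts']
    print(titles(nested))           # ['Nutrtional facts', 'Piano facts']
```

The first variant mirrors the shape of the original query (with corrected field names) and shows why it returned everything; the other two are the two fixes proposed in this thread.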
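Tip: It is better to model the timestamps as fields of type `date`. This allows you to use date math (e.g. `now`) in your queries. Internally, Elasticsearch stores dates as milliseconds since the epoch.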
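A sketch of what that could look like, written as Python dicts that would be sent as JSON bodies. It assumes the stored values remain epoch seconds, hence the explicit `epoch_second` format (with the default date format, a bare number would be read as milliseconds), and it assumes ES 7.x typeless mappings. The names `MAPPING` and `QUERY` are illustrative only.

```python
import json

# Body for `PUT my_index`: both timestamps as date fields holding epoch seconds.
MAPPING = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "begin_timestamp": {"type": "date", "format": "epoch_second"},
            "end_timestamp": {"type": "date", "format": "epoch_second"},
        }
    }
}

# Body for `GET my_index/_search`: `now` is resolved by Elasticsearch at query time,
# so the client no longer has to compute a timestamp.
QUERY = {
    "query": {
        "bool": {
            "must": [
                {"match": {"title": "facts"}},
                {"bool": {"should": [
                    {"range": {"begin_timestamp": {"gt": "now"}}},
                    {"range": {"end_timestamp": {"gt": "now"}}},
                ]}},
            ]
        }
    }
}

if __name__ == "__main__":
    print(json.dumps(MAPPING, indent=2))
    print(json.dumps(QUERY, indent=2))
```

Changing the type of an existing field generally requires creating a new index with this mapping and reindexing the documents into it.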