Elasticsearch query based on multiple conditions
I am trying to search based on multiple parameters.
For example, my documents have the following structure:
{
"id": "101",
"start_time": "2021-12-13T06:57:29.420198Z",
"end_time": "2021-12-13T07:00:23.511722Z",
"data": [{
"starttimestamp": "2022-01-03T11:21:22.107413Z",
"user": "John",
"speech": "Thanks. I’m just happy that it’s over. I was really nervous about it.",
"endtimestamp": "2022-01-03T11:21:22.247482Z"
},
{
"starttimestamp": "2022-01-03T11:21:26.905401Z",
"user": "Tom",
"speech": "Well, I’m sure you did great. where is the university",
"endtimestamp": "2022-01-03T11:21:27.065316Z"
},
{
"starttimestamp": "2022-01-03T11:21:33.165617Z",
"user": "John",
"speech": "university is in canada.",
"endtimestamp": "2022-01-03T11:21:33.165900Z"
}
]
}
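Now I need to search for what a specific user said.
For example:
Positive case: search for documents where user "John" talked about "canada" at start timestamp "2022-01-03T11:21:33.165617Z".
Output: the document above should be returned.
Negative case: search for documents where user "Tom" talked about "canada" at start timestamp "2022-01-03T11:21:33.165617Z".
Output: the result should be empty, because "Tom" never talked about "canada".
I tried to get the expected output with the following query: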
{
"query": {
"bool": {
"must": [
{
"term": {"data.user": "Tom"}
},
{
"bool": {
"must": [
{"term": {"data.speech": "canada"}}
]
}
}
]
}
}
}
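This query should return an empty list, but it returns the document above.
I consulted some other resources, but did not find anything that solves my exact problem.
Please see this documentation on how array fields work in Elasticsearch. Because data is an array of objects, the objects in it cannot be queried independently of each other.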
Arrays of objects do not work as you would expect: you cannot query
each object independently of the other objects in the array. If you
need to be able to do this then you should use the nested data type
instead of the object data type.
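To make the quoted behaviour concrete, here is a minimal sketch in plain Python that imitates the difference between the two mapping types. It is a simplified model and not how Elasticsearch or Lucene store data internally: the helper names (tokens, matches_flattened, matches_nested) are made up for illustration, the tokenizer is only a rough stand-in for the default analyzer, and the speech values are abbreviated.

```python
import re

# Abbreviated copy of the "data" array from the question.
doc = {
    "data": [
        {"user": "John", "speech": "Thanks. I was really nervous about it."},
        {"user": "Tom", "speech": "Well, I'm sure you did great. where is the university"},
        {"user": "John", "speech": "university is in canada."},
    ]
}


def tokens(text):
    # Rough stand-in for an analyzer: lowercase, split on non-alphanumerics.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches_flattened(doc, user, word):
    # "object" behaviour: values of every array element are pooled per field,
    # so the link between a user and that user's speech is lost.
    all_users = set()
    all_words = set()
    for item in doc["data"]:
        all_users |= tokens(item["user"])
        all_words |= tokens(item["speech"])
    return user.lower() in all_users and word.lower() in all_words


def matches_nested(doc, user, word):
    # "nested" behaviour: both conditions must hold inside the same element.
    return any(
        user.lower() in tokens(item["user"])
        and word.lower() in tokens(item["speech"])
        for item in doc["data"]
    )


print(matches_flattened(doc, "Tom", "canada"))  # pooled fields: unwanted match
print(matches_nested(doc, "Tom", "canada"))     # per-element: no match
print(matches_nested(doc, "John", "canada"))    # per-element: match
```

In this model the pooled check accepts "Tom" plus "canada" because both values occur somewhere in the array, which is the unexpected result described in the question; the per-element check rejects it.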
You can create the index with a nested mapping for data, as in this example:
PUT index_name
{
"mappings": {
"properties": {
"data":{
"type": "nested"
}
}
}
}
Index the following document:
POST index_name/_doc
{
"id": "101",
"start_time": "2021-12-13T06:57:29.420198Z",
"end_time": "2021-12-13T07:00:23.511722Z",
"data": [
{
"starttimestamp": "2022-01-03T11:21:22.107413Z",
"user": "John",
"speech": "Thanks. I’m just happy that it’s over. I was really nervous about it.",
"endtimestamp": "2022-01-03T11:21:22.247482Z"
},
{
"starttimestamp": "2022-01-03T11:21:26.905401Z",
"user": "Tom",
"speech": "Well, I’m sure you did great. where is the university",
"endtimestamp": "2022-01-03T11:21:27.065316Z"
},
{
"starttimestamp": "2022-01-03T11:21:33.165617Z",
"user": "John",
"speech": "university is in canada.",
"endtimestamp": "2022-01-03T11:21:33.165900Z"
}
]
}
Sample query:
POST index_name/_search
{
"query": {
"nested": {
"path": "data",
"query": {
"bool": {
"must": [
{
"match": {
"data.user": "John"
}
},
{
"match": {
"data.speech": "canada"
}
}
]
}
}
}
}
}
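The sample query above checks the user and the speech but not the start timestamp that the question also mentions. As a sketch only, the snippet below builds the same nested query body in Python and optionally adds a third clause on data.starttimestamp. The function name build_nested_query is hypothetical, and the term clause assumes that starttimestamp ends up mapped as a date field (for example through dynamic date detection); it has not been run against a live cluster, so check it against your own mapping.

```python
import json


def build_nested_query(user, word, start_timestamp=None):
    # All clauses live inside one "nested" query, so they must match
    # within the same element of the "data" array.
    must = [
        {"match": {"data.user": user}},
        {"match": {"data.speech": word}},
    ]
    if start_timestamp is not None:
        # Assumption: data.starttimestamp is mapped as a date field.
        must.append({"term": {"data.starttimestamp": start_timestamp}})
    return {
        "query": {
            "nested": {
                "path": "data",
                "query": {"bool": {"must": must}},
            }
        }
    }


body = build_nested_query("Tom", "canada", "2022-01-03T11:21:33.165617Z")
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of POST index_name/_search. For the negative case with "Tom", the expectation is an empty hit list, since no single element of data has Tom as the user together with "canada" in the speech.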
See this blog for a clearer explanation of the object and nested types.