Array included in array search with elasticsearch
I have users indexed with categories as follows:
{
  "id": 1,
  "name": "John",
  "categories": [
    { "id": 1, "name": "Category 1" },
    { "id": 2, "name": "Category 2" }
  ]
},
{
  "id": 2,
  "name": "Mark",
  "categories": [
    { "id": 1, "name": "Category 1" }
  ]
}
I am trying to get all documents that have category 1 or category 2 with:
{
  "filter": {
    "bool": {
      "must": [
        { "terms": { "user.categories.id": [1, 2] } }
      ]
    }
  }
}
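But it only returns the first document, which has both categories. What am I doing wrong?
As far as I understand, terms matches when any one of the values is contained in the field, so for user 1
user.categories.id: [1, 2]
and for user 2
user.categories.id: [1]
category id 1 is contained in both documents.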
The best way to handle this is probably with a nested filter. You will have to specify the "nested" type in your mapping, though.
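To see why the mapping type matters, here is a toy Python model (not Elasticsearch code) of how an array of objects is indexed. With the default object type, the inner objects are flattened into one multi-valued field per sub-field, so the link between an id and its name is lost; with "nested", each inner object is matched on its own. The helper names below are made up for illustration, and the model ignores text analysis.

```python
# Toy model only: these helpers are hypothetical and do not call Elasticsearch.

def flatten_object_array(doc, path):
    """Mimic the default object mapping: one multi-valued field per sub-field."""
    flat = {}
    for obj in doc[path]:
        for key, value in obj.items():
            flat.setdefault(f"{path}.{key}", []).append(value)
    return flat


def flat_match(doc, path, conditions):
    """Each condition only has to be satisfied by some value in the flattened field."""
    flat = flatten_object_array(doc, path)
    return all(value in flat.get(f"{path}.{key}", []) for key, value in conditions.items())


def nested_match(doc, path, conditions):
    """All conditions have to be satisfied by the same inner object."""
    return any(
        all(obj.get(key) == value for key, value in conditions.items())
        for obj in doc[path]
    )


john = {
    "id": 1,
    "name": "John",
    "categories": [
        {"id": 1, "name": "Category 1"},
        {"id": 2, "name": "Category 2"},
    ],
}

flattened = flatten_object_array(john, "categories")
# No single category has id 1 AND name "Category 2".
mixed = {"id": 1, "name": "Category 2"}
flat_result = flat_match(john, "categories", mixed)
nested_result = nested_match(john, "categories", mixed)

print(flattened)
print(flat_result, nested_result)
```

In this model the flattened version reports a match for the mixed condition even though no single category satisfies it, while the nested version does not.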
I can set up the index like this:
PUT /test_index
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "doc": {
      "properties": {
        "categories": {
          "type": "nested",
          "properties": {
            "id": {
              "type": "long"
            },
            "name": {
              "type": "string"
            }
          }
        },
        "id": {
          "type": "long"
        },
        "name": {
          "type": "string"
        }
      }
    }
  }
}
Then add some documents:
PUT /test_index/doc/1
{
  "id": 1,
  "name": "John",
  "categories": [
    { "id": 1, "name": "Category 1" },
    { "id": 2, "name": "Category 2" }
  ]
}
PUT /test_index/doc/2
{
  "id": 2,
  "name": "Mark",
  "categories": [
    { "id": 1, "name": "Category 1" }
  ]
}
PUT /test_index/doc/3
{
  "id": 3,
  "name": "Bill",
  "categories": [
    { "id": 3, "name": "Category 3" },
    { "id": 4, "name": "Category 4" }
  ]
}
Now I can use a nested terms filter like this:
POST /test_index/doc/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "nested": {
          "path": "categories",
          "filter": {
            "terms": {
              "categories.id": [1, 2]
            }
          }
        }
      },
      "boost": 1.2
    }
  }
}
...
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {
        "_index": "test_index",
        "_type": "doc",
        "_id": "1",
        "_score": 1,
        "_source": {
          "id": 1,
          "name": "John",
          "categories": [
            {
              "id": 1,
              "name": "Category 1"
            },
            {
              "id": 2,
              "name": "Category 2"
            }
          ]
        }
      },
      {
        "_index": "test_index",
        "_type": "doc",
        "_id": "2",
        "_score": 1,
        "_source": {
          "id": 2,
          "name": "Mark",
          "categories": [
            {
              "id": 1,
              "name": "Category 1"
            }
          ]
        }
      }
    ]
  }
}
Here is the code I used:
http://sense.qbox.io/gist/668aefe910643b52a3a10d40aca67104491668fc
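If you are on a later Elasticsearch release, the search request may need adjusting: as far as I know, filters and queries were merged in 2.x, and the nested clause then takes a "query" key instead of "filter". Below is a minimal sketch of that body, built as a Python dict and printed as JSON so it can be pasted into a console. The variable name request_body is my own, and the mapping would likely need matching changes too (for example, "string" was later replaced by "text" and "keyword"), so treat this as an assumption to check against the docs for your version.

```python
import json

# Assumed shape for releases where the nested *filter* became the nested *query*.
request_body = {
    "query": {
        "constant_score": {
            "filter": {
                "nested": {
                    "path": "categories",
                    # "query" replaces the "filter" key used in the request above
                    "query": {"terms": {"categories.id": [1, 2]}},
                }
            },
            "boost": 1.2,
        }
    }
}

print(json.dumps(request_body, indent=2))
```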