Get object hit when searching array in Elasticsearch
I am trying to get a single object out of a JSON array stored in Elasticsearch. The layout is like this:
[
object{}
object{}
object{}
]
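When I run a search and it hits one of the objects, what do I need to do to get back the specific object that matched? Currently, using the Java API, I am searching like this: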
QueryBuilder qb = QueryBuilders.boolQuery()
        .should(QueryBuilders.matchQuery("text", "pottery").boost(5)
                .minimumShouldMatch("1"));
SearchResponse response = client.prepareSearch("stuff")
        .setTypes("things")
        .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
        .setQuery(qb)
        .setPostFilter(filter) //.setHighlighterQuery(qb)
        .addField("places.numbers")
        .addField("name")
        .addField("city")
        .setFrom(0).setSize(60).setExplain(true)
        .execute()
        .actionGet();
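But this only returns the entire document that was hit, or, when I tell it to return the field "places.numbers", it returns only the first object in the "places" array, not the one that matched the query.
Thanks for any help!
There are a couple of ways to handle this. I would probably use the nested type and inner hits, given what you've shown in your question, but it could probably also be done with a parent/child relationship.
Here is an example with nested documents. I set up a simple index like this: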
PUT /test_index
{
  "mappings": {
    "parent_doc": {
      "properties": {
        "parent_name": {
          "type": "string"
        },
        "nested_docs": {
          "type": "nested",
          "properties": {
            "nested_name": {
              "type": "string"
            }
          }
        }
      }
    }
  }
}
Then I added a couple of simple documents:
POST /test_index/parent_doc/_bulk
{"index":{"_id":1}}
{"parent_name":"p1","nested_docs":[{"nested_name":"n1"},{"nested_name":"n2"}]}
{"index":{"_id":2}}
{"parent_name":"p2","nested_docs":[{"nested_name":"n3"},{"nested_name":"n4"}]}
Now I can search like this, using "inner_hits":
POST /test_index/_search
{
  "query": {
    "nested": {
      "path": "nested_docs",
      "query": {
        "match": {
          "nested_docs.nested_name": "n3"
        }
      },
      "inner_hits": {}
    }
  }
}
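Since the question uses Java, here is a minimal sketch of producing that same request body from Java code. It deliberately avoids the Elasticsearch client classes, because I have not checked which inner-hits builder method your client version exposes; it only builds the JSON string, which you would then send with whatever HTTP or client call you already use. The class name NestedInnerHitsQuery and its methods are hypothetical, made up for this illustration.

```java
// Sketch only: builds the nested + inner_hits search body as a plain JSON string.
// NestedInnerHitsQuery, escape and body are illustrative names, not Elasticsearch APIs.
final class NestedInnerHitsQuery {

    /**
     * Escapes backslashes and double quotes so a value can sit inside a JSON
     * string literal. Control characters are not handled in this sketch.
     */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /**
     * Builds the body for POST /<index>/_search.
     *
     * @param path  the nested field, e.g. "nested_docs"
     * @param field full name of the field to match, e.g. "nested_docs.nested_name"
     * @param text  the text to match
     */
    static String body(String path, String field, String text) {
        return "{\"query\":{\"nested\":{"
                + "\"path\":\"" + escape(path) + "\","
                + "\"query\":{\"match\":{\"" + escape(field) + "\":\"" + escape(text) + "\"}},"
                + "\"inner_hits\":{}"
                + "}}}";
    }

    public static void main(String[] args) {
        // Prints the compact form of the request body shown above.
        System.out.println(body("nested_docs", "nested_docs.nested_name", "n3"));
    }
}
```

For real use a JSON library would be a safer way to build the body than string concatenation; the point here is only the shape of the request.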
This search returns:
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 2.098612,
    "hits": [
      {
        "_index": "test_index",
        "_type": "parent_doc",
        "_id": "2",
        "_score": 2.098612,
        "_source": {
          "parent_name": "p2",
          "nested_docs": [
            {
              "nested_name": "n3"
            },
            {
              "nested_name": "n4"
            }
          ]
        },
        "inner_hits": {
          "nested_docs": {
            "hits": {
              "total": 1,
              "max_score": 2.098612,
              "hits": [
                {
                  "_index": "test_index",
                  "_type": "parent_doc",
                  "_id": "2",
                  "_nested": {
                    "field": "nested_docs",
                    "offset": 0
                  },
                  "_score": 2.098612,
                  "_source": {
                    "nested_name": "n3"
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
Here is the code I used to test the example above:
http://sense.qbox.io/gist/ef7debf436fec2a10097ba2106d5ff30ff8d7c77
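To pull just the matched objects out of a response shaped like the one above, you can walk inner_hits.nested_docs.hits.hits and read each entry's _source. The sketch below assumes the response JSON has already been parsed into plain java.util Maps and Lists by whatever JSON library you use; InnerHitsReader, child, matchedObjects and sampleHit are hypothetical names for this illustration, and sampleHit is a hand-built copy of only the relevant part of the hit for document 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: reads matched nested objects from an already-parsed search hit.
final class InnerHitsReader {

    /** Looks up a child object by key; returns an empty map when it is absent. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> child(Map<String, Object> node, String key) {
        Object value = node.get(key);
        return value instanceof Map
                ? (Map<String, Object>) value
                : Collections.<String, Object>emptyMap();
    }

    /**
     * For one top-level hit, returns the _source of every nested object that
     * matched, read from inner_hits.<path>.hits.hits[*]._source.
     */
    @SuppressWarnings("unchecked")
    static List<Map<String, Object>> matchedObjects(Map<String, Object> hit, String path) {
        List<Map<String, Object>> matched = new ArrayList<>();
        Object innerHits = child(child(child(hit, "inner_hits"), path), "hits").get("hits");
        if (innerHits instanceof List) {
            for (Object inner : (List<Object>) innerHits) {
                if (inner instanceof Map) {
                    matched.add(child((Map<String, Object>) inner, "_source"));
                }
            }
        }
        return matched;
    }

    /** Hand-built stand-in for the parsed hit of document 2 in the response above. */
    static Map<String, Object> sampleHit() {
        Map<String, Object> innerHit = new HashMap<>();
        innerHit.put("_id", "2");
        innerHit.put("_source", Collections.<String, Object>singletonMap("nested_name", "n3"));

        // {"hits": {"hits": [innerHit]}}
        Map<String, Object> hitsForPath = Collections.<String, Object>singletonMap(
                "hits",
                Collections.<String, Object>singletonMap("hits", Arrays.<Object>asList(innerHit)));

        Map<String, Object> hit = new HashMap<>();
        hit.put("_id", "2");
        hit.put("inner_hits", Collections.<String, Object>singletonMap("nested_docs", hitsForPath));
        return hit;
    }

    public static void main(String[] args) {
        // Prints [{nested_name=n3}]
        System.out.println(matchedObjects(sampleHit(), "nested_docs"));
    }
}
```

Each inner hit also carries "_nested.offset", the position of the matched object within the parent's array, if you need the index rather than the object itself.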