Elasticsearch - multi_match does not work on nested fields
My records can carry several translations of a single text field, for example:
{
  "type": "movie",
  "title": {
    "en": "Dark Knight",
    "de": "Der dunkle Ritter"
  }
}
To represent these records, I created the following index:
{
  "mappings": {
    "_doc": {
      "properties": {
        "type": {
          "type": "text",
          "analyzer": "english"
        },
        "title": {
          "type": "nested",
          "properties": {
            "de": {
              "type": "text",
              "analyzer": "german"
            },
            "en": {
              "type": "text",
              "analyzer": "english"
            }
          }
        }
      }
    }
  }
}
But when I run a multi_match query, it does not return the expected results. This query finds the record (it matches on the top-level type field):
{
  "query": {
    "multi_match": {
      "query": "movie"
    }
  }
}
But this query does not (it should match on the nested title.en field):
{
  "query": {
    "multi_match": {
      "query": "dark"
    }
  }
}
This is surprising, because the term vectors for the title.en field suggest that the record was indexed correctly:
GET /test_with_lang/_doc/1/_termvectors?pretty=true&fields=*
{
  "_index": "test_with_lang",
  "_type": "_doc",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 1,
  "term_vectors": {
    "title.en": {
      "field_statistics": {
        "sum_doc_freq": 2,
        "doc_count": 1,
        "sum_ttf": 2
      },
      "terms": {
        "dark": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 0,
              "start_offset": 0,
              "end_offset": 4
            }
          ]
        },
        "knight": {
          "term_freq": 1,
          "tokens": [
            {
              "position": 1,
              "start_offset": 5,
              "end_offset": 11
            }
          ]
        }
      }
    }
  }
}
The query also seems to use the correct fields, and it should match one of the tokens:
Request:
GET /test_with_lang/_doc/1/_explain
{
  "query": {
    "multi_match": {
      "query": "dark"
    }
  }
}
Reply:
{
"_index": "test_with_lang",
"_type": "_doc",
"_id": "1",
"matched": false,
"explanation": {
"value": 0.0,
"description": "Failure to meet condition(s) of required/prohibited clause(s)",
"details": [
{
"value": 0.0,
"description": "no match on required clause ((type:dark | title.en:dark | title.de:dark))",
"details": [
{
"value": 0.0,
"description": "No matching clause",
"details": []
}
]
},
...
]
}
]
}
}
Note that it is looking for the token dark in the field title.en (no match on required clause ((type:dark | title.en:dark | title.de:dark))).
I am using Elasticsearch 6.2.1.
The query looks like it should work. Am I missing something?
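Nested fields require a dedicated nested query, because each nested object is indexed as a separate hidden document that a top-level query does not search: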
{
  "query": {
    "nested": {
      "path": "title",
      "query": {
        "multi_match": {
          "query": "dark"
        }
      }
    }
  }
}
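If only some translations should be searched, the inner multi_match can name its fields explicitly. The following is a sketch of a complete search request body, assuming the mapping from the question; the choice of title.en and title.de is only an illustration, and nested field names have to be written with their full path:

```json
{
  "query": {
    "nested": {
      "path": "title",
      "query": {
        "multi_match": {
          "query": "dark",
          "fields": ["title.en", "title.de"]
        }
      }
    }
  }
}
```

Listing the fields also makes it obvious which analyzers take part in the match, since each field is searched with its own analyzer.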
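But I doubt that you need a nested field in your case. Just use the regular object type for the title field, and a simple multi_match query will search across all document fields.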
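As a sketch of that alternative, the mapping from the question could look roughly like this with title declared as a plain object. This assumes the same Elasticsearch 6.x mapping layout with the _doc type; the only change is the type of title:

```json
{
  "mappings": {
    "_doc": {
      "properties": {
        "type": {
          "type": "text",
          "analyzer": "english"
        },
        "title": {
          "type": "object",
          "properties": {
            "de": {
              "type": "text",
              "analyzer": "german"
            },
            "en": {
              "type": "text",
              "analyzer": "english"
            }
          }
        }
      }
    }
  }
}
```

The type of an existing field cannot be changed in place, so this would mean creating a new index with this mapping and indexing the documents again. After that, the original multi_match query for dark should no longer need a nested wrapper.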