Elasticsearch: minimum_should_match for nested query
I have a nested field like this:
{
"tags": [
{
"tag": "lorem ipsum"
},
{
"tag": "Lorem ipsum dolor sit amet"
}
]
}
with a mapping like this:
{
"tags": {
"type": "nested",
"properties": {
"tag": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
Can we use something like minimum_should_match: 80 on the nested tag field, so that I can control relevance with it?
For example:
If I search for "Lorem ipsum dolor" with minimum_should_match: 90, I should not get the "lorem ipsum" result.
A nested query is just the syntax for accessing nested fields, so minimum_should_match can be used the same way as in any other query.
Query:
{
"query": {
"nested": {
"path": "tags",
"query": {
"match": {
"tags.tag":
{
"query": "lorem ipsum dolor",
"minimum_should_match": "90%"
}
}
},
"inner_hits": {}
}
}
}
Result:
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.671082,
"hits" : [
{
"_index" : "index56",
"_type" : "_doc",
"_id" : "01We63ABq1Ib1oOmkJxn",
"_score" : 0.671082,
"_source" : {
"tags" : [
{
"tag" : "lorem ipsum"
},
{
"tag" : "Lorem ipsum dolor sit amet"
}
]
},
"inner_hits" : {
"tags" : {
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.89999837,
"hits" : [
{
"_index" : "index56",
"_type" : "_doc",
"_id" : "01We63ABq1Ib1oOmkJxn",
"_nested" : {
"field" : "tags",
"offset" : 1
},
"_score" : 0.89999837,
"_source" : {
"tag" : "Lorem ipsum dolor sit amet"
}
},
{
"_index" : "index56",
"_type" : "_doc",
"_id" : "01We63ABq1Ib1oOmkJxn",
"_nested" : {
"field" : "tags",
"offset" : 0
},
"_score" : 0.44216567,
"_source" : {
"tag" : "lorem ipsum"
}
}
]
}
}
}
}
]
}
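With minimum_should_match: 90%, both nested documents are returned in inner_hits.
Reason, from the docs:
The number computed from the percentage is rounded down and used as the minimum.
The query "lorem ipsum dolor" has 3 terms, and 90% of 3 is 2.7, which is rounded down to 2. So only 2 terms need to match, and the "lorem ipsum" tag qualifies.
With minimum_should_match: 100%, all 3 terms must match, so only one nested document ("Lorem ipsum dolor sit amet") is returned.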
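The rounding behaviour can be checked with a small Python sketch. This is only an illustration of the arithmetic, not Elasticsearch code: `required_matches` and `tag_matches` are hypothetical helper names, only positive percentages are handled, and tokenization is approximated by lowercasing and splitting on whitespace, which happens to agree with the standard analyzer for these sample strings.

```python
def required_matches(num_terms: int, percent: int) -> int:
    """Number of query terms that must match for a positive percentage.

    The value computed from the percentage is rounded down
    (integer division), as the quoted documentation describes.
    """
    return (num_terms * percent) // 100


def tag_matches(query: str, tag: str, percent: int) -> bool:
    # Assumption: lowercase + whitespace split stands in for the analyzer.
    query_terms = query.lower().split()
    tag_terms = set(tag.lower().split())
    matched = sum(1 for term in query_terms if term in tag_terms)
    return matched >= required_matches(len(query_terms), percent)


tags = ["lorem ipsum", "Lorem ipsum dolor sit amet"]

for percent in (90, 100):
    hits = [t for t in tags if tag_matches("lorem ipsum dolor", t, percent)]
    print(percent, required_matches(3, percent), hits)
# 90 2 ['lorem ipsum', 'Lorem ipsum dolor sit amet']
# 100 3 ['Lorem ipsum dolor sit amet']
```

With a three-term query, any percentage from 67% to 99% rounds down to 2 required terms, so excluding the "lorem ipsum" tag needs 100% (or an absolute value of 3).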