Elasticsearch partial query
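I am using Elasticsearch v7.3.1 and trying to implement partial search. All searches work fine, except that when I query "John Oxford", "John" matches the document but "Oxford" appears nowhere in it. The document is still returned instead of an empty result.
What should I do so that no document is returned when we query "John Oxford"?
My mapping, settings, sample document, and query for the student data are below.
Mapping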
PUT student
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "DOB": {
        "type": "text"
      },
      "email": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "first_name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "home_phone": {
        "type": "text"
      },
      "last_name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "student_id": {
        "type": "text"
      }
    }
  }
}
Sample document
POST student/_doc
{
  "DOB": "1983-12-04",
  "email": "johndoe@gmail.fr",
  "first_name": "john",
  "home_phone": 1242432,
  "last_name": "doe",
  "student_id": 28
}
Query
GET student/_search
{
  "query": {
    "multi_match": {
      "query": "john oxford",
      "type": "bool_prefix",
      "analyzer": "standard",
      "fields": [
        "first_name",
        "last_name",
        "email",
        "DOB",
        "home_phone",
        "student_id"
      ]
    }
  }
}
Below are the results I want:
- 1242 - partial match on home_phone
- joh do - partial match on "John" and "Doe"
- 1983-12-04 - match on date of birth
- johndoe - partial match on email
- doe - match on last name
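To implement partial search, you should add the autocomplete analyzer to the text fields that need it and set a separate search_analyzer on them. This is necessary because you are using an edge_ngram filter: the indexed text should be expanded into edge n-grams, but the query text should not be. The Elasticsearch documentation on the edge_ngram token filter and on the search_analyzer mapping parameter explains this in more detail. Setting the analyzers in the mapping is also more convenient than specifying the analyzer at query time, as you did. Try: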
PUT student
{
  "settings": {
    "analysis": {
      "filter": {
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": 1,
          "max_gram": 20
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "autocomplete_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "DOB": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      },
      "email": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "first_name": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "home_phone": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard"
      },
      "last_name": {
        "type": "text",
        "analyzer": "autocomplete",
        "search_analyzer": "standard",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "student_id": {
        "type": "text"
      }
    }
  }
}
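To see what the two analyzers do, the _analyze API can be run against the new index. This is only a sketch, assuming the index was created with the settings above; the sample text "john" is just an illustration.

```
# Index-time analysis: the custom analyzer emits edge n-grams
GET student/_analyze
{
  "analyzer": "autocomplete",
  "text": "john"
}

# Search-time analysis: the standard analyzer keeps the term whole
GET student/_analyze
{
  "analyzer": "standard",
  "text": "john"
}
```

The first request should return the tokens j, jo, joh, and john, while the second returns only john. That is why a partial query such as "joh" can match the indexed name, while the query text itself is never expanded into shorter prefixes.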
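Then, when you query with two terms, you should combine them with the and operator, so that every term has to match somewhere in the document. With the default or operator, a document that matches only "john" is still returned. For your use case, the cross_fields type should fit best: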
GET student/_search
{
  "query": {
    "multi_match": {
      "query": "John Oxford",
      "type": "cross_fields",
      "fields": [
        "first_name",
        "last_name",
        "email",
        "DOB",
        "home_phone",
        "student_id"
      ],
      "operator": "and"
    }
  }
}
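The same query shape can be checked against the partial inputs listed in the question. This is a sketch that assumes the index was deleted and recreated with the new mapping (PUT student fails on an index that already exists, and the analyzer of an existing field cannot be changed) and that the sample document was indexed again afterwards:

```
# Both terms are prefixes of indexed values, in different fields
GET student/_search
{
  "query": {
    "multi_match": {
      "query": "joh do",
      "type": "cross_fields",
      "fields": [
        "first_name",
        "last_name",
        "email",
        "DOB",
        "home_phone",
        "student_id"
      ],
      "operator": "and"
    }
  }
}
```

With only the sample document indexed, this should return it, since "joh" is an edge n-gram of first_name and "do" is one of last_name. The "John Oxford" query should return no hits, because no field contains a token starting with "oxford".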