Elasticsearch is matching substrings when it shouldn't be
Here is the GET request I'm running:
curl -XGET https://search-mycluster-xxxxxxxxxxxxxxxxxxxxxx.us-east-2.es.amazonaws.com/test/_search? -d '
{
  "size": 1,
  "_source": ["url"],
  "query": {
    "match": { "circular": "NC-xx-xxxx-y" }
  }
}
'
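It works fine. If I change circular from "NC-xx-xxxx-y" to "doesntwork", no results are returned, as expected. If I change it to "NC" (a substring of the original string), the result for "NC-xx-xxxx-y" shows up. Even if I make circular "NC-xx-xxxx-ya", I get the result for "NC-xx-xxxx-y". I only want the query to match when circular is exactly "NC-xx-xxxx-y". Any ideas on how to change this query?
Here is my mapping: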
{
"test" : {
"mappings" : {
"files" : {
"properties" : {
"submitted_by" : {
"fields" : {
"raw" : {
"type" : "keyword"
}
},
"type" : "text"
},
"effective_date" : {
"type" : "date",
"fields" : {
"raw" : {
"type" : "date"
}
}
},
"date_filed" : {
"type" : "date",
"fields" : {
"raw" : {
"type" : "date"
}
}
},
"date" : {
"fields" : {
"keyword" : {
"ignore_above" : 256,
"type" : "keyword"
}
},
"type" : "text"
},
"form" : {
"fields" : {
"raw" : {
"type" : "keyword"
}
},
"type" : "text"
},
"topic" : {
"fields" : {
"raw" : {
"type" : "keyword"
}
},
"type" : "text"
},
"circular" : {
"fields" : {
"raw" : {
"type" : "keyword"
}
},
"type" : "text"
},
"profit_center" : {
"type" : "text",
"fields" : {
"raw" : {
"type" : "keyword"
}
}
},
"url" : {
"fields" : {
"raw" : {
"type" : "keyword"
}
},
"type" : "text"
},
"circular_link" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"subject" : {
"type" : "text",
"fields" : {
"raw" : {
"type" : "keyword"
}
}
},
"form_title" : {
"type" : "text",
"fields" : {
"raw" : {
"type" : "keyword"
}
}
},
"state" : {
"fields" : {
"raw" : {
"type" : "keyword"
}
},
"type" : "text"
},
"lob" : {
"fields" : {
"raw" : {
"type" : "keyword"
}
},
"type" : "text"
},
"contractor" : {
"fields" : {
"raw" : {
"type" : "keyword"
}
},
"type" : "text"
},
"link_rate_form" : {
"type" : "text",
"fields" : {
"raw" : {
"type" : "keyword"
}
}
},
"product_filing" : {
"type" : "text",
"fields" : {
"raw" : {
"type" : "keyword"
}
}
},
"status" : {
"fields" : {
"raw" : {
"type" : "keyword"
}
},
"type" : "text"
},
"date_entered" : {
"type" : "date",
"fields" : {
"raw" : {
"type" : "date"
}
}
}
}
}
}
}
}
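You are using the standard analyzer for both indexing and searching, so your search term "NC-xx-xxxx-y" is broken into ["nc", "xx", "xxxx", "y"]. A match query returns a document when any of those tokens matches, which is why "NC" and "NC-xx-xxxx-ya" both find it. If you want an exact match, you can use a term query. Running it in a filter context should also be a bit faster.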
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "circular.raw": "NC-xx-xxxx-y" }
      }
    }
  }
}
Edit: added .raw to circular so the query searches the keyword sub-field.
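To make the difference concrete, here is a rough Python sketch of the two behaviours. It does not call Elasticsearch: `standard_analyze`, `match_query` and `term_query_on_keyword` are hypothetical helpers, and the regex is only an approximation of the standard analyzer (split on non-alphanumeric characters, then lowercase), assuming the default OR operator for match.

```python
import re


def standard_analyze(text):
    # Approximation of the standard analyzer (assumption, not the real
    # implementation): split on non-alphanumeric characters and lowercase.
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


def match_query(indexed_value, query_text):
    # match on a text field: both sides are analyzed, and with the default
    # OR operator a single shared token is enough for a hit.
    indexed_tokens = set(standard_analyze(indexed_value))
    return any(tok in indexed_tokens for tok in standard_analyze(query_text))


def term_query_on_keyword(indexed_value, query_text):
    # term on a keyword sub-field: the whole value is stored as one token
    # and the query text is compared to it verbatim.
    return indexed_value == query_text


doc = "NC-xx-xxxx-y"
for q in ["NC-xx-xxxx-y", "NC", "NC-xx-xxxx-ya", "doesntwork"]:
    print(q, match_query(doc, q), term_query_on_keyword(doc, q))
```

In this sketch "NC" and "NC-xx-xxxx-ya" both hit under match because they share at least one token with the indexed value, while the term comparison only succeeds for the exact string.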
https://www.elastic.co/guide/en/elasticsearch/guide/current/term-vs-full-text.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html