Elasticsearch partial matching on postal address and customer number
I'm trying to partially match a search term against a given pattern for autocomplete. I want to match any document whose customerNumber, AddressLine1 or Zip starts with 419 (so 4191 should match customer number 41915678, address 4191 Board Street and zip code 41912).
"mappings": {
  "companyName": {
    "type": "text"
  },
  "customerNumber": {
    "type": "long"
  },
  "address": {
    "addressLine1": {
      "type": "text"
    },
    "city": {
      "type": "text"
    },
    "state": {
      "type": "text"
    },
    "zip": {
      "type": "text"
    }
  }
}
Does anyone have a neat solution for this query? Ultimately I need to translate the query to C# using the NEST client.
A simple approach is to use the completion suggester field type.
Basically, you modify your mapping by adding a completion field to it, for example:
"suggest": {
  "type": "completion"
},
However, the default analyzer for completion fields (the simple analyzer) does not index digits, so we need to create a custom analyzer to handle them properly:
PUT my-index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "suggest_analyzer": {                <--- custom analyzer
          "type": "custom",
          "tokenizer": "classic",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      ...,
      "suggest": {                           <--- the new completion field with the right analyzer
        "type": "completion",
        "analyzer": "suggest_analyzer"
      }
    }
  }
}
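To see why digits get lost, here is a rough Python sketch of what the two analyzers emit. It is only an approximation built on regular expressions, not Elasticsearch itself: the function names are hypothetical, and the real classic tokenizer is grammar-based and handles more cases than the second function does.

```python
import re


def simple_analyzer_tokens(text):
    """Approximate the simple analyzer: split on non-letters, then lowercase."""
    # [^\W\d_]+ matches runs of letters only, so digits never reach the index.
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]


def digit_preserving_tokens(text):
    """Rough stand-in (assumption) for suggest_analyzer: keep letters and digits."""
    # [^\W_]+ matches runs of letters and digits, then we lowercase them.
    return [t.lower() for t in re.findall(r"[^\W_]+", text)]


if __name__ == "__main__":
    for value in ("4191 Board Street", "41912", "41915678"):
        print(value, simple_analyzer_tokens(value), digit_preserving_tokens(value))
```

With the letters-only approximation, a zip code or customer number produces no tokens at all, which is why a prefix such as 419 could not match it. The `_analyze` API is the way to check the real token output on your cluster.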
Then populate the index, adding every value you want to suggest to the suggest field, like this:
PUT my-index/_doc/1
{
  "address": {
    "addressLine1": "1234 Main Street",
    "zip": "34526"
  },
  "customerNumber": "41915678",
  "suggest": [
    "1234 Main Street",
    "34526",
    "41915678"
  ]
}

PUT my-index/_doc/2
{
  "address": {
    "addressLine1": "4191 Board Street",
    "zip": "45263"
  },
  "customerNumber": "45267742",
  "suggest": [
    "4191 Board Street",
    "45263",
    "45267742"
  ]
}

PUT my-index/_doc/3
{
  "address": {
    "addressLine1": "5662 4th Avenue",
    "zip": "41912"
  },
  "customerNumber": "24442561",
  "suggest": [
    "5662 4th Avenue",
    "41912",
    "24442561"
  ]
}
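Since the suggest array just repeats values already present in the document, it can be derived at indexing time instead of being written by hand. A minimal Python sketch, assuming the document shape shown above; build_suggest_inputs and with_suggest are hypothetical helper names, not part of any Elasticsearch client:

```python
def build_suggest_inputs(doc):
    """Collect the values that should be offered as completions."""
    address = doc.get("address", {})
    candidates = [
        address.get("addressLine1"),
        address.get("zip"),
        doc.get("customerNumber"),
    ]
    # Skip missing or empty values; stringify so a numeric customerNumber works too.
    return [str(value) for value in candidates if value not in (None, "")]


def with_suggest(doc):
    """Return a copy of the document with the suggest field filled in."""
    return {**doc, "suggest": build_suggest_inputs(doc)}


if __name__ == "__main__":
    doc = {
        "address": {"addressLine1": "4191 Board Street", "zip": "45263"},
        "customerNumber": "45267742",
    }
    print(with_suggest(doc)["suggest"])
```

The resulting dictionary can be sent as the document body with whichever client you use; the same idea carries over to a C# model class when you move to NEST.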
You can then search for 419 with the following suggest query:
POST my-index/_search
{
  "suggest": {
    "customer-suggest": {
      "prefix": "419",
      "completion": {
        "field": "suggest"
      }
    }
  }
}
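You will get all three documents back, because each one has a field that starts with 419 (the customer number of document 1, the address of document 2 and the zip code of document 3).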
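The expected result can be checked without a cluster. The Python sketch below builds the same request body for an arbitrary prefix and runs a purely local, case-insensitive prefix check over the three sample documents. It is a simulation under simplifying assumptions, not a call to Elasticsearch, and completion_query, SAMPLE_INPUTS and matching_ids are hypothetical names introduced for illustration.

```python
def completion_query(prefix, field="suggest", name="customer-suggest"):
    """Build the suggest request body used above for a given prefix."""
    return {"suggest": {name: {"prefix": prefix, "completion": {"field": field}}}}


# The suggest inputs of the three sample documents, keyed by document id.
SAMPLE_INPUTS = {
    1: ["1234 Main Street", "34526", "41915678"],
    2: ["4191 Board Street", "45263", "45267742"],
    3: ["5662 4th Avenue", "41912", "24442561"],
}


def matching_ids(prefix, inputs_by_id=SAMPLE_INPUTS):
    """Local approximation: ids whose inputs start with the prefix."""
    wanted = prefix.lower()
    return sorted(
        doc_id
        for doc_id, inputs in inputs_by_id.items()
        if any(value.lower().startswith(wanted) for value in inputs)
    )


if __name__ == "__main__":
    print(completion_query("419"))
    print(matching_ids("419"))
```

Under this approximation both 419 and 4191 select all three documents, while a longer prefix such as 41915 narrows the result to document 1.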