Is there a character limit on an individual word within a match_phrase query in Elasticsearch?
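I'm fairly new to Elasticsearch, so bear with me. I've run into a problem: if I search for a document using a word of 20 characters or fewer, the document is returned, but as soon as that same word in the query has any more characters, I get no results:
- Using 'phenoxymethylpenicillin' returns no documents.
- Using 'phenoxymethylpenicil' returns the document.
Here is the query I'm trying to use: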
{
  "match_phrase": {
    "genericNames.name": {
      "query": "phenoxymethylpenicillin",
      "slop": 15,
      "zero_terms_query": "NONE",
      "boost": 1.0
    }
  }
}
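Here is the full query: https://pastebin.com/DEJvP2uS
As I said, I'm fairly new to this and may be looking in the wrong place.
So my question is: which areas could be causing this, and why?
Thanks!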
EDIT:
Below is an excerpt from one of the documents in the sample data. I can't show much because a lot of it is sensitive, but fortunately I can share the names from the sample data. This is the data I'm trying to search:
"genericNames": [
  {
    "nameType": 1,
    "name": "Phenoxymethylpenicillin 250mg tablets",
    "nameChangeCode": "0000",
    "nameBasisCode": "0001",
    "nameTypeDescription": "Name",
    "startDate": "1948-01-01T00:00:00.000000+0000",
    "endDate": "3456-02-01T00:00:00.000000+0000"
  },
  {
    "nameType": 5,
    "name": "Penicillin V 250mg tablets",
    "nameTypeDescription": "Alternative Name 3",
    "startDate": "1948-01-01T00:00:00.000000+0000",
    "endDate": "3456-02-01T00:00:00.000000+0000"
  }
],
I've also included the index mapping, since it may provide additional information:
{
"amp": {
"mappings": {
"properties": {
"_class": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"ampId": {
"type": "long"
},
"amppId": {
"type": "long"
},
"attributes": {
"type": "nested",
"properties": {
"attributeQualifier": {
"type": "keyword"
},
"attributeType": {
"type": "integer"
},
"attributeTypeDescription": {
"type": "keyword"
},
"attributeValue": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"countryId": {
"type": "long"
},
"decodedValue": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
}
}
},
"dictionaries": {
"type": "nested",
"properties": {
"abbreviation": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"description": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"dictId": {
"type": "integer"
},
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
}
}
},
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"excipients": {
"type": "nested",
"properties": {
"basisOfStrengthCode": {
"type": "keyword"
},
"bossId": {
"type": "long"
},
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"id": {
"type": "long"
},
"ingredientNames": {
"properties": {
"endDate": {
"type": "date"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"startDate": {
"type": "date"
}
}
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"strengthDenominatorUnitOfMeasureCode": {
"type": "keyword"
},
"strengthDenominatorValue": {
"type": "keyword"
},
"strengthNumeratorUnitOfMeasureCode": {
"type": "keyword"
},
"strengthNumeratorValue": {
"type": "keyword"
},
"strengthVal": {
"type": "keyword"
},
"unitOfMeasure": {
"type": "keyword"
}
}
},
"extractableEntry": {
"type": "boolean"
},
"genericNames": {
"type": "nested",
"properties": {
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"name": {
"type": "text",
"ignore_above": 256,
"fields": {
"raw": {
"type": "keyword"
}
},
"analyzer": "autocomplete_index",
"search_analyzer": "autocomplete_search"
},
"nameBasisCode": {
"type": "keyword"
},
"nameChangeCode": {
"type": "keyword"
},
"nameType": {
"type": "integer"
},
"nameTypeDescription": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
}
}
},
"id": {
"type": "keyword"
},
"ingredients": {
"type": "nested",
"properties": {
"basisOfStrengthCode": {
"type": "keyword"
},
"bossId": {
"type": "long"
},
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"id": {
"type": "long"
},
"ingredientNames": {
"properties": {
"endDate": {
"type": "date"
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"startDate": {
"type": "date"
}
}
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"strengthDenominatorUnitOfMeasureCode": {
"type": "keyword"
},
"strengthDenominatorValue": {
"type": "keyword"
},
"strengthNumeratorUnitOfMeasureCode": {
"type": "keyword"
},
"strengthNumeratorValue": {
"type": "keyword"
},
"strengthVal": {
"type": "keyword"
},
"unitOfMeasure": {
"type": "keyword"
}
}
},
"invalidEntry": {
"type": "boolean"
},
"pitId": {
"type": "integer"
},
"ppaCodes": {
"type": "nested",
"properties": {
"code": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
}
}
},
"proprietaryNames": {
"type": "nested",
"properties": {
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"name": {
"type": "text",
"ignore_above": 256,
"fields": {
"raw": {
"type": "keyword"
}
},
"analyzer": "autocomplete_index",
"search_analyzer": "autocomplete_search"
},
"nameBasisCode": {
"type": "keyword"
},
"nameChangeCode": {
"type": "keyword"
},
"nameType": {
"type": "integer"
},
"nameTypeDescription": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
}
}
},
"qpuUomCde": {
"type": "keyword"
},
"qpuVal": {
"type": "keyword"
},
"qtyUomCde": {
"type": "keyword"
},
"qtyVal": {
"type": "keyword"
},
"snomedCodes": {
"type": "nested",
"properties": {
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"ppaNextNo": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"snomed": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
}
}
},
"snomedDescriptions": {
"type": "nested",
"properties": {
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"ppaNextNo": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"snomed": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
}
}
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"suppliers": {
"type": "nested",
"properties": {
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"id": {
"type": "long"
},
"names": {
"type": "nested",
"properties": {
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"name": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
},
"analyzer": "autocomplete_index",
"search_analyzer": "autocomplete_search"
},
"nameBasisCode": {
"type": "keyword"
},
"nameChangeCode": {
"type": "keyword"
},
"nameType": {
"type": "integer"
},
"nameTypeDescription": {
"type": "text",
"fields": {
"raw": {
"type": "keyword"
}
}
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
}
}
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
}
}
},
"udfs": {
"type": "nested",
"properties": {
"ddIndicator": {
"type": "integer"
},
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"udfsUomCode": {
"type": "keyword"
},
"udfsValue": {
"type": "keyword"
},
"vmpUomCode": {
"type": "keyword"
}
}
},
"vmpId": {
"type": "long"
},
"vmppId": {
"type": "long"
},
"vtms": {
"type": "nested",
"properties": {
"endDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
},
"id": {
"type": "long"
},
"startDate": {
"type": "date",
"format": "uuuu-MM-dd'T'HH:mm:ss.SSSSSSZ"
}
}
}
}
}
}
}
EDIT: Added a link to the full query - https://pastebin.com/DEJvP2uS
EDIT: Index settings:
{
  "index": {
    "max_ngram_diff": "20",
    "analysis": {
      "filter": {
        "autocomplete_suffix_filter": {
          "type": "ngram",
          "min_gram": "1",
          "max_gram": "20"
        },
        "autocomplete_filter": {
          "type": "edge_ngram",
          "min_gram": "1",
          "max_gram": "20"
        }
      },
      "analyzer": {
        "autocomplete_index": {
          "filter": [
            "lowercase",
            "autocomplete_filter",
            "autocomplete_suffix_filter"
          ],
          "type": "custom",
          "tokenizer": "standard"
        },
        "autocomplete_search": {
          "filter": [
            "lowercase"
          ],
          "type": "custom",
          "tokenizer": "standard"
        }
      }
    },
    "number_of_replicas": "1"
  }
}
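In the index mapping provided above, genericNames is of the nested type, so you need to use a nested query.
Here is a working example using the same index data provided above, together with the search query and the search result.
Search query: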
{
  "query": {
    "nested": {
      "path": "genericNames",
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "genericNames.name": "phenoxymethylpenicillin"
              }
            }
          ]
        }
      },
      "inner_hits": {}
    }
  }
}
Search result:
"hits": [
  {
    "_index": "64817981",
    "_type": "_doc",
    "_id": "1",
    "_nested": {
      "field": "genericNames",
      "offset": 0
    },
    "_score": 0.7361701,
    "_source": {
      "nameType": 1,
      "name": "Phenoxymethylpenicillin 250mg tablets",
      "nameChangeCode": "0000",
      "nameBasisCode": "0001",
      "nameTypeDescription": "Name",
      "startDate": "1948-01-01T00:00:00.000000+0000",
      "endDate": "3456-02-01T00:00:00.000000+0000"
    }
  }
]
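As a rough sketch of how the match_phrase clause from the question could sit inside the same nested wrapper, the Python snippet below assembles the request body as a plain dict and prints it as JSON. The helper name `nested_query` is made up for illustration; only the `nested`, `path`, `query` and `inner_hits` keys come from the example above. Wrapping the clause only addresses the nested mapping; whether a given term matches still depends on how the field is analyzed.

```python
import json


def nested_query(path, inner_query):
    """Wrap an inner query clause in a nested query for the given path.

    Hypothetical helper for illustration; it only assembles a dict.
    """
    return {
        "query": {
            "nested": {
                "path": path,
                "query": inner_query,
                "inner_hits": {},
            }
        }
    }


# The match_phrase clause from the question, unchanged.
match_phrase_clause = {
    "match_phrase": {
        "genericNames.name": {
            "query": "phenoxymethylpenicillin",
            "slop": 15,
            "zero_terms_query": "NONE",
            "boost": 1.0,
        }
    }
}

body = nested_query("genericNames", match_phrase_clause)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index.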
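This must be caused by the custom analyzers on your genericNames.name field. The field uses two different custom analyzers, autocomplete_index at index time and autocomplete_search at search time, but the question does not include the definitions of these analyzers, only the mapping.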
Please provide the output of the _settings API for your index; see https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-settings.html for more information.
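For reference, a minimal Python sketch of that settings request, using only the standard library. The host `http://localhost:9200` is an assumption (a local, unsecured cluster), and the index name `amp` is assumed from the top-level key of the mapping in the question; adjust both to your setup. The request is only built here, not sent.

```python
from urllib.request import Request

# Assumptions: a local, unsecured cluster and the index name "amp"
# (taken from the top-level key of the mapping in the question).
ES_URL = "http://localhost:9200"
INDEX = "amp"


def settings_request(base_url, index):
    """Build (but do not send) a GET request for the index settings."""
    return Request(f"{base_url}/{index}/_settings", method="GET")


req = settings_request(ES_URL, INDEX)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req).read()
```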
You need to use the analyze API to check the tokens that the autocomplete_index and autocomplete_search analyzers generate for phenoxymethylpenicillin; you will notice the difference.
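To make the expected difference concrete, here is a small, self-contained Python approximation of the two analyzers, assuming the definitions shown in the index settings in the question (`edge_ngram` and `ngram` filters with `min_gram` 1 and `max_gram` 20, applied after `lowercase`). It is a sketch, not Elasticsearch itself: the standard tokenizer is approximated by splitting on whitespace, and the function names are invented for illustration. The authoritative check is still the analyze API.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Prefixes of the token, like an edge_ngram token filter."""
    longest = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, longest + 1)]


def ngrams(token, min_gram=1, max_gram=20):
    """All substrings of length min_gram..max_gram, like an ngram filter."""
    grams = []
    for start in range(len(token)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(token):
                grams.append(token[start:start + size])
    return grams


def autocomplete_index(text):
    """Approximate autocomplete_index: lowercase, edge_ngram, then ngram."""
    tokens = set()
    for word in text.lower().split():  # crude stand-in for the standard tokenizer
        for prefix in edge_ngrams(word):
            tokens.update(ngrams(prefix))
    return tokens


def autocomplete_search(text):
    """Approximate autocomplete_search: lowercase only, no n-grams."""
    return text.lower().split()


indexed = autocomplete_index("Phenoxymethylpenicillin 250mg tablets")
longest_indexed = max(len(t) for t in indexed)

for query in ("phenoxymethylpenicil", "phenoxymethylpenicillin"):
    terms = autocomplete_search(query)
    found = all(term in indexed for term in terms)
    print(f"{query} ({len(query)} chars): term indexed = {found}")

print("longest indexed token:", longest_indexed)
```

In this approximation no indexed token is longer than 20 characters, because the n-gram filters cap token length at `max_gram`. The search analyzer leaves the 23-character query term whole, so it has nothing to match, while its 20-character prefix does.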