Correct analyzer for a field with backslash
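I'm trying to set up a new index mapping with the correct analyzer for a Windows credentials field, whose format is domain\username.
I want to be able to search for the domain, the username, and domain\username. But the default analyzer seems to ignore the backslash (meaning that if I search for domain\username, it searches for "domain OR username" and ignores the backslash), and if I try the whitespace analyzer, it seems to match only domain\username.
Any suggestions?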
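You can use the path hierarchy tokenizer (path_hierarchy) with the backslash set as the delimiter; see the documentation for that tokenizer.
Try the following. Note that the backslash must be escaped as \\ inside JSON strings: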
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_path_tree": {
          "tokenizer": "custom_hierarchy"
        },
        "custom_path_tree_reversed": {
          "tokenizer": "custom_hierarchy_reversed"
        }
      },
      "tokenizer": {
        "custom_hierarchy": {
          "type": "path_hierarchy",
          "delimiter": "\\"
        },
        "custom_hierarchy_reversed": {
          "type": "path_hierarchy",
          "delimiter": "\\",
          "reverse": "true"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "file_path": {
        "type": "text",
        "fields": {
          "tree": {
            "type": "text",
            "analyzer": "custom_path_tree"
          },
          "tree_reversed": {
            "type": "text",
            "analyzer": "custom_path_tree_reversed"
          }
        }
      }
    }
  }
}

POST my_index/_analyze
{
  "analyzer": "custom_path_tree",
  "text": "C:\\Windows\\Users"
}
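To see why the forward and reversed analyzers together cover all three searches (domain, username, and domain\username), here is a rough Python sketch of the tokens a path hierarchy tokenizer emits. It is only an illustration, not Elasticsearch's implementation: the function name path_hierarchy_tokens is hypothetical, and it assumes input without leading or trailing delimiters.

```python
def path_hierarchy_tokens(text, delimiter="\\", reverse=False):
    """Approximate the tokens emitted by a path_hierarchy tokenizer.

    Simplified sketch (hypothetical helper): assumes `text` has no
    leading or trailing delimiter.
    """
    parts = text.split(delimiter)
    if reverse:
        # Reversed mode: drop leading segments one at a time (suffixes).
        return [delimiter.join(parts[i:]) for i in range(len(parts))]
    # Forward mode: grow the path one segment at a time (prefixes).
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


if __name__ == "__main__":
    credential = r"domain\username"
    print(path_hierarchy_tokens(credential))                # domain, domain\username
    print(path_hierarchy_tokens(credential, reverse=True))  # domain\username, username
```

In this sketch the forward variant yields the domain and the full value, while the reversed variant yields the full value and the username, so querying both subfields (tree and tree_reversed in the mapping above) should match any of the three forms.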