Slash doesn't work in a matching query using regexp in Elasticsearch
The official Elasticsearch documentation says: Any reserved character can be escaped with a backslash "\*" including a literal backslash character: "\\".
Can you explain why this query
{
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "path": ".*test\/test.txt.*"
          }
        },
        {
          "match": {
            "user_id": 1
          }
        }
      ]
    }
  }
}
does not find this document:
{
  "_index": "pictures",
  "_type": "picture",
  "_id": "wiskQ2kBi923Omj4U",
  "_score": 1,
  "_source": {
    "user_id": 1,
    "tag": [],
    "text": "some text",
    "path": "test/test.txt"
  }
}
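Since path is an analyzed field, the regexp will not match it. A regexp query is matched against the individual indexed terms, and the pattern has to match a whole term. test/test.txt gets tokenized into two separate terms, test and test.txt, and neither of them contains a slash. Since path has a sub-field keyword of data type keyword, which indexes test/test.txt as is, you should query that field instead, i.e. path.keyword.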
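To make the term-level matching concrete, here is a minimal sketch in plain Python. It does not talk to Elasticsearch. `crude_standard_tokens` and `regexp_hits` are hypothetical helpers, the tokenizer is only a rough stand-in for the standard analyzer, and Python's `re` is used in place of the Lucene regexp engine (the two agree for this simple pattern).

```python
import re

def crude_standard_tokens(text):
    # Rough stand-in for the standard analyzer (an assumption, not the real
    # implementation): lowercase, then split on whitespace and "/".
    return [t for t in re.split(r"[\s/]+", text.lower()) if t]

def regexp_hits(pattern, terms):
    # A regexp query pattern is anchored, so a term must match in full;
    # re.fullmatch mimics that behaviour here.
    return [t for t in terms if re.fullmatch(pattern, t)]

pattern = r".*test/test.txt.*"
value = "test/test.txt"

analyzed_terms = crude_standard_tokens(value)  # terms of the analyzed `path` field
keyword_terms = [value]                        # the single term of `path.keyword`

print(analyzed_terms)                        # ['test', 'test.txt']
print(regexp_hits(pattern, analyzed_terms))  # []
print(regexp_hits(pattern, keyword_terms))   # ['test/test.txt']
```

No analyzed term contains the slash, so the pattern has nothing to match. The keyword term is the untouched original string, so it matches.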
Use the following query:
{
  "query": {
    "bool": {
      "must": [
        {
          "regexp": {
            "path.keyword": ".*test/test.txt.*"
          }
        },
        {
          "match": {
            "user_id": 1
          }
        }
      ]
    }
  }
}
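Note that the backslash before the slash was never the problem. In JSON, `\/` is just another spelling of `/`, and `/` is not a reserved character in the regexp syntax, so both ways of writing the pattern send the same string. A quick check, assuming Python's `json` module as a stand-in for the JSON parsing done on the Elasticsearch side:

```python
import json

# The pattern as written in the question, with the JSON escape "\/".
escaped = json.loads(r'{"path": ".*test\/test.txt.*"}')
# The pattern as written in this answer, with a plain "/".
plain = json.loads(r'{"path.keyword": ".*test/test.txt.*"}')

print(escaped["path"])                           # .*test/test.txt.*
print(escaped["path"] == plain["path.keyword"])  # True
```

The only change that matters is the field name, path.keyword instead of path.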