Query in Kibana doesn't return logs with Regexp

Question

I have a field named log.file.path in Elasticsearch with the value /var/log/dev-collateral/uaa.2020-09-26.log. I am trying to retrieve all logs whose log.file.path field starts with /var/log/dev-collateral/uaa. I used the regexp below, but it doesn't work.

{
    "regexp":{
        "log.file.path": "/var/log/dev-collateral/uaa.*"
    }
}

Answer 1

Let's see why it doesn't work. I indexed two documents using the Kibana UI, as shown below:

PUT myindex/_doc/1
{
  "log.file.path" : "/var/log/dev-collateral/uaa.2020-09-26.log"
}

PUT myindex/_doc/2
{
  "log.file.path" : "/var/log/dev-collateral/uaa.2020-09-26.txt"
}

When I use the _analyze API to look at the tokens produced for the text of the log.file.path field:

POST _analyze
{
  "text": "/var/log/dev-collateral/uaa.2020-09-26.log"
}

it gives me:

{
  "tokens" : [
    {
      "token" : "var",
      "start_offset" : 1,
      "end_offset" : 4,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "log",
      "start_offset" : 5,
      "end_offset" : 8,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "dev",
      "start_offset" : 9,
      "end_offset" : 12,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "collateral",
      "start_offset" : 13,
      "end_offset" : 23,
      "type" : "<ALPHANUM>",
      "position" : 3
    },
    {
      "token" : "uaa",
      "start_offset" : 24,
      "end_offset" : 27,
      "type" : "<ALPHANUM>",
      "position" : 4
    },
    {
      "token" : "2020",
      "start_offset" : 28,
      "end_offset" : 32,
      "type" : "<NUM>",
      "position" : 5
    },
    {
      "token" : "09",
      "start_offset" : 33,
      "end_offset" : 35,
      "type" : "<NUM>",
      "position" : 6
    },
    {
      "token" : "26",
      "start_offset" : 36,
      "end_offset" : 38,
      "type" : "<NUM>",
      "position" : 7
    },
    {
      "token" : "log",
      "start_offset" : 39,
      "end_offset" : 42,
      "type" : "<ALPHANUM>",
      "position" : 8
    }
  ]
}

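As you can see, Elasticsearch split the input text into tokens when it was indexed. This is because Elasticsearch uses the standard analyzer by default when we index documents: it splits the text into small parts (tokens), removes punctuation, lowercases the text, and so on. A regexp query is a term-level query, so the pattern has to match an entire indexed token, and no single token contains the full path. That is why your current regexp query doesn't work.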
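As a quick check of this explanation, a regexp that fits inside a single token should match the analyzed field. The sketch below assumes the myindex documents indexed above with the default dynamic mapping; the pattern ua.* is only an illustrative choice.

```
# regexp is matched against each indexed token, not the original string;
# "ua.*" fits the whole token "uaa"
GET myindex/_search
{
  "query": {
    "regexp": {
      "log.file.path": "ua.*"
    }
  }
}
```

Both documents contain the token uaa, so both would be expected to come back, whereas a pattern containing slashes cannot match any token produced by the standard analyzer.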

GET myindex/_search
{
  "query": {
    "match": {
      "log.file.path": "var"
    }
  }
}

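The match query above works because it searches for a single token, var. For your case, however, you need to match every log.file.path that ends with .log. So what now? Simply do not apply an analyzer when indexing the documents. The keyword type stores the string you provide as-is.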
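To preview what an un-analyzed value looks like, the same _analyze request can be repeated with the built-in keyword analyzer, which emits the whole input as one token. This is only an illustration: a keyword field does not literally run this analyzer, but the term it indexes is likewise the entire string.

```
# Same text as before, but analyzed with the built-in "keyword" analyzer
POST _analyze
{
  "analyzer": "keyword",
  "text": "/var/log/dev-collateral/uaa.2020-09-26.log"
}
```

The response should contain a single token holding the full path, which is the kind of term a regexp over the whole path can match.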

Create a mapping with the keyword type:

PUT myindex2
{
  "mappings": {
    "properties": {
      "log.file.path": {
        "type": "keyword"
      }
    }
  }
}

Index the documents:

PUT myindex2/_doc/1
{
  "log.file.path" : "/var/log/dev-collateral/uaa.2020-09-26.log"
}

PUT myindex2/_doc/2
{
  "log.file.path" : "/var/log/dev-collateral/uaa.2020-09-26.txt"
}

Search with regexp:

GET myindex2/_search
{
  "query": {
    "regexp": {
      "log.file.path": "/var/log/dev-collateral/uaa.2020-09-26.*"
    }
  }
}
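Since the goal stated in the question is a "starts with" match, a prefix query on the keyword field is a possible alternative to regexp. A minimal sketch, assuming the myindex2 mapping and documents shown above:

```
# The prefix value is compared literally against the start of the stored string
GET myindex2/_search
{
  "query": {
    "prefix": {
      "log.file.path": {
        "value": "/var/log/dev-collateral/uaa"
      }
    }
  }
}
```

Unlike a regexp pattern, the prefix value is not interpreted, so the dots in the path need no escaping. In a regexp, an unescaped . matches any character; a literal dot would be written as \\. inside the JSON string.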

Answer 2

I used this query, and it works!

{
  "query": {
    "regexp": {
      "log.file.path.keyword": {
        "value": "/var/log/dev-collateral/uaa.*",
        "flags": "ALL",
        "max_determinized_states": 10000,
        "rewrite": "constant_score"
      }
    }
  }
}
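
This query presumably works because, when no explicit mapping exists, Elasticsearch's default dynamic mapping indexes a string as a text field with a keyword sub-field, which is what log.file.path.keyword refers to. A sketch of an equivalent explicit mapping follows; the index name myindex3 is hypothetical, and the ignore_above value mirrors the dynamic default.

```
# Hypothetical index: text for full-text search, keyword sub-field for exact and regexp matching
PUT myindex3
{
  "mappings": {
    "properties": {
      "log.file.path": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
```

Running GET myindex/_mapping against the original index shows whether the keyword sub-field is actually present there.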


querydsl

elasticsearch

kibana