Kibana scripted field which loops through an array

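I am trying to use the Metricbeat http module to monitor F5 pools.

I make a request to the F5 API and get back JSON, which is saved to Kibana. But the JSON contains an array of pool members, and I want to count them.

The suggestion seems to be that this can be done with a scripted field. However, I can't get the script to retrieve the array. For example,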

doc['http.f5pools.items.monitor'].value.length()

returns the following in the preview results, with the same field added as an 'Additional Field' for comparison:

[
 {
  "_id": "rT7wdGsBXQSGm_pQoH6Y",
  "http": {
   "f5pools": {
    "items": [
     {
      "monitor": "default"
     },
     {
      "monitor": "default"
     }
    ]
   }
  },
  "pool.MemberCount": [
   7
  ]
 },

If I try

doc['http.f5pools.items']

or similar, I just get an error:

"reason": "No field found for [http.f5pools.items] in mapping with types []"

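Googling suggests that the doc construct does not contain arrays?

  1. Is it possible to write a scripted field that can access the whole set of values? I.e., is the problem my code or the way I'm indexing the data?
  2. If not, is there an alternative within Metricbeat? I don't want to build a whole new API just to do the calculation and add a separate field.

-- Update.

Weirdly, numeric values in the array seem to return the expected result. I.e.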

doc['http.f5pools.items.ratio']

returns

 {
  "_id": "BT6WdWsBXQSGm_pQBbCa",
  "pool.MemberCount": [
   1,
   1
  ]
 },

-- Update 2

OK, so if the strings in the field have different values, you get all of them. If they are the same, you get just one. WTF?

OK, solved.

https://discuss.elastic.co/t/problem-looping-through-array-in-each-doc-with-painless/90648

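So I discovered that arrays are pre-filtered to return only distinct values (except in the case of integers, apparently?)

The solution is to use params._source instead of doc[].

The answer for why doc doesn't work

To quote: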

Doc values are a columnar field value store, enabled by default on all fields except for analyzed text fields.

Doc-values can only return "simple" field values like numbers, dates, geo-points, terms, etc, or arrays of these values if the field is multi-valued. It cannot return JSON objects.

Also, it is important to add a null check, as mentioned below:

Missing fields

The doc['field'] will throw an error if field is missing from the mappings. In painless, a check can first be done with doc.containsKey('field') to guard accessing the doc map. Unfortunately, there is no way to check for the existence of the field in mappings in an expression script.

Also, here is why _source works

To quote:

The document _source, which is really just a special stored field, can be accessed using the _source.field_name syntax. The _source is loaded as a map-of-maps, so properties within object fields can be accessed as, for example, _source.name.first.
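As a rough illustration of the "map-of-maps" idea, the Python sketch below treats a document's _source as nested dicts and counts the entries under http.f5pools.items, returning 0 when any level is missing. This is only a model of the logic: the function name count_pool_members is hypothetical, and in Kibana the same steps would have to be written in Painless against params._source.

```python
def count_pool_members(source):
    """Count entries under http.f5pools.items in a _source-like nested dict.

    Hypothetical helper for illustration; not part of any Elastic API.
    """
    node = source
    # Walk the object path one level at a time, guarding each step
    # so that a missing field yields 0 instead of an error.
    for key in ("http", "f5pools", "items"):
        if not isinstance(node, dict) or key not in node:
            return 0
        node = node[key]
    if isinstance(node, list):
        return len(node)
    # A single object (not wrapped in an array) counts as one member.
    return 1 if node is not None else 0


doc = {
    "http": {
        "f5pools": {
            "items": [{"monitor": "default"}, {"monitor": "default"}]
        }
    }
}
print(count_pool_members(doc))  # 2: both members are counted, duplicates included
```

Because the original JSON is walked directly, two members with the same monitor value are still counted as two, which is the behavior the question was after.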


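Replying to your comment with an example:

The key phrase here is: it cannot return JSON objects. The field doc['http.f5pools.items'] is a JSON object.

Try running the below and look at the mapping it creates: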

PUT t5/doc/2
{
   "items": [
     {
      "monitor": "default"
     },
     {
      "monitor": "default"
     }
    ]
}


GET t5/_mapping

{
  "t5" : {
    "mappings" : {
      "doc" : {
        "properties" : {
          "items" : {
            "properties" : {
              "monitor" : {  <-- monitor is a property of items property(Object)
                "type" : "text",
                "fields" : {
                  "keyword" : {
                    "type" : "keyword",
                    "ignore_above" : 256
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

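I am adding another answer instead of deleting my previous one, which did not address the actual problem but may be helpful to others in the future.

I found a hint in the same documentation:

Doc values are a columnar field value store

After further googling I found this Doc Value Intro, which says that doc values are essentially an "uninverted index" that is useful for operations like sorting. My assumption is that when sorting you basically don't want the same value repeated, so the data structure they use drops those duplicates. That still doesn't answer why it works differently for strings than for numbers: numbers are kept, but strings are filtered down to unique values.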

This “uninverted” structure is often called a “column-store” in other systems. Essentially, it stores all the values for a single field together in a single column of data, which makes it very efficient for operations like sorting.

In Elasticsearch, this column-store is known as doc values, and is enabled by default. Doc values are created at index-time: when a field is indexed, Elasticsearch adds the tokens to the inverted index for search. But it also extracts the terms and adds them to the columnar doc values.

A further deep-dive into doc values revealed that it is a compression technique that actually de-duplicates values for efficient and memory-friendly operation.

The note given in the link above answers the question:

You may be thinking "Well that’s great for numbers, but what about strings?" Strings are encoded similarly, with the help of an ordinal table. The strings are de-duplicated and sorted into a table, assigned an ID, and then those ID’s are used as numeric doc values. Which means strings enjoy many of the same compression benefits that numerics do.

The ordinal table itself has some compression tricks, such as using fixed, variable or prefix-encoded strings.
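The difference observed in the question can be mimicked with a toy Python model. It assumes, in line with the quote above and the behavior seen in the thread, that string values of one document are stored as a de-duplicated sorted set while numeric values are stored as a sorted list that keeps duplicates. The function names are hypothetical and this is not how Lucene actually implements it.

```python
def keyword_doc_values(values):
    """Toy model of string doc values: de-duplicated and sorted."""
    return sorted(set(values))


def numeric_doc_values(values):
    """Toy model of numeric doc values: sorted, duplicates kept."""
    return sorted(values)


monitors = ["default", "default"]  # items.monitor from the question
ratios = [1, 1]                    # items.ratio from the question

print(keyword_doc_values(monitors))  # ['default']
print(numeric_doc_values(ratios))    # [1, 1]
```

Under this model, counting the values returned by doc[] for the monitor field gives 1 rather than 2, while the ratio field still gives 2, which matches what the scripted field showed.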

Also, if you don't want this behavior, you can disable doc-values.