Run an Elasticsearch processor on all the fields of a document

Question

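I'm trying to trim and lowercase all the values of a document that is being indexed into Elasticsearch.

The available processors require a field key, which means a single processor can only be applied to one field.

Is there a way to run a processor on all the fields of a document?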

Answer 1

Yes, there is. Use a script processor, but take care to skip reserved metadata keys such as _type, _id, and so on:

PUT _ingest/pipeline/my_string_trimmer
{
  "description": "Trims and lowercases all string values",
  "processors": [
    {
      "script": {
        "source": """
          def forbidden_keys = [
            '_type',
            '_id',
            '_version_type',
            '_index',
            '_version'
          ];
          
          def corrected_source = [:];
          
          for (pair in ctx.entrySet()) {
            def key = pair.getKey();
            if (forbidden_keys.contains(key)) {
              continue;
            }
            def value = pair.getValue();
            
            if (value instanceof String) {
              corrected_source[key] = value.trim().toLowerCase();
            } else {
              corrected_source[key] = value;
            }
          }
          
          // overwrite the original
          ctx.putAll(corrected_source);
        """
      }
    }
  ]
}

Test it with a sample document:

POST my-index/_doc?pipeline=my_string_trimmer
{
  "abc": " DEF ",
  "def": 123,
  "xyz": false
}
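
To check the pipeline without writing anything to an index, the simulate pipeline API can be used. This is a minimal sketch that assumes the my_string_trimmer pipeline above has already been created; it passes the same sample document in the _source of a test doc:

POST _ingest/pipeline/my_string_trimmer/_simulate
{
  "docs": [
    {
      "_source": {
        "abc": " DEF ",
        "def": 123,
        "xyz": false
      }
    }
  ]
}

In the response, abc should come back as "def", while the non-string values def and xyz are left untouched. Note that the script as written only inspects the top-level keys of ctx, so strings nested inside objects or arrays pass through unchanged.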
