Using the Ingest Attachment Plugin within an Elasticsearch index template

I am trying to update my current Elasticsearch schema from 1.3.2 to the latest version. For one of the indexes, the current schema looks like this:

curl -XPOST localhost:9200/_template/<INDEXNAME> -d '{
    "template" : "*-<INDEXNAME_TYPE>",
    "index.mapping.attachment.indexed_chars": -1,
    "mappings" : {
        "post" : {
            "properties" : {
                "sub" : { "type" : "string" },
                "sender" : { "type" : "string" },
                "dt" : { "type" : "date", "format" : "EEE, d MMM yyyy HH:mm:ss Z" },
                "body" : { "type" : "string"},
                "attachments" : {
                    "type" : "attachment",
                    "path" : "full",
                    "fields" : {
                        "attachments" : {
                            "type" : "string",
                            "term_vector" : "with_positions_offsets",
                            "store" : true
                        },
                        "name" : {"store" : "yes"},
                        "title" : {"store" : "yes"},
                        "date" : {"store" : "yes"},
                        "content_type" : {"store" : "yes"},
                        "content_length" : {"store" : "yes"}
                    }
                }
            }
        }
    }
}'

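My old Elasticsearch installation had the "mapper-attachment" plugin installed. I know that the "mapper-attachment" plugin has been replaced by the "Ingest Attachment Processor", and I understand the example on the plugin's website, which creates a pipeline and then indexes a document through it: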

PUT _ingest/pipeline/attachment
  {
    "description" : "Extract attachment information from arrays",
    "processors" : [
      {
        "foreach": {
          "field": "attachments",
          "processor": {
            "attachment": {
              "target_field": "_ingest._value.attachment",
              "field": "_ingest._value.data",
              "indexed_chars" : -1
            }
          }
        }
      }
    ]
  }

  PUT my-index-000001/_doc/my_id?pipeline=attachment
  {
    "sub" : "This is a test post",
    "sender" : "jane.doe@gmail.com",
    "dt" : "Sat, 15 Jan 2022 08:50:00 AEST",
    "body" : "Test Body",
    "fromaddr": "jane.doe@gmail.com",
    "toaddr": "larne.jones@gmail.com",
    "attachments" : [
      {
        "filename" : "ipsum.txt",
        "data" : "dGhpcyBpcwpqdXN0IHNvbWUgdGV4dAo="
      },
      {
        "filename" : "test.txt",
        "data" : "VGhpcyBpcyBhIHRlc3QK"
      }
    ]
  } 
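
For reference, the "data" values in the request above are Base64-encoded file contents, which is the form the attachment processor reads. Below is a minimal Python sketch of how such a document body could be assembled from raw bytes. The helper names encode_attachment and build_post are hypothetical and exist only for this illustration; the field names are taken from the example above.

```python
import base64
import json


def encode_attachment(filename, raw_bytes):
    """Build one entry of the "attachments" array (hypothetical helper)."""
    # The pipeline above reads Base64 text from each element's "data" field.
    return {
        "filename": filename,
        "data": base64.b64encode(raw_bytes).decode("ascii"),
    }


def build_post(sub, sender, dt, body, files):
    """Assemble a post document; files is a list of (filename, bytes) pairs."""
    return {
        "sub": sub,
        "sender": sender,
        "dt": dt,
        "body": body,
        "attachments": [encode_attachment(name, raw) for name, raw in files],
    }


if __name__ == "__main__":
    post = build_post(
        "This is a test post",
        "jane.doe@gmail.com",
        "Sat, 15 Jan 2022 08:50:00 AEST",
        "Test Body",
        [("test.txt", b"This is a test\n")],
    )
    # Print the JSON body that would be sent with ?pipeline=attachment.
    print(json.dumps(post, indent=2))
```

Encoding the bytes of "This is a test" followed by a newline this way reproduces the "data" string used for test.txt in the request above.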

How can I use this new attachment processor to recreate my previous index template?

Note: with my index and schema, each "post" will have one or more attachments.

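The answer is that, unlike in previous versions, I cannot use the "attachment" data type in the mapping. So, following the examples on the elastic.co website and in my own question, the answer was in the question itself.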

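  • Step 1: Create the pipeline, as shown in the question.
  • Step 2: Create the schema [see below].
  • Step 3: Insert the data as shown in the question. When inserting data into the index, pass pipeline=attachment (the name of the pipeline) so that the plugin parses each given attachment into the fields defined in the schema.
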
curl -XPOST localhost:9200/_template/<INDEXNAME> -d '{
    "template" : "*-<INDEXNAME_TYPE>",
    "index.mapping.attachment.indexed_chars": -1,
    "mappings" : {
        "post" : {
            "properties" : {
                "sub" : { "type" : "text" },
                "sender" : { "type" : "text" },
                "dt" : { "type" : "date", "format" : "EEE, d MMM yyyy HH:mm:ss Z" },
                "body" : { "type" : "text"},
                "attachments" : {
                    "properties" : {
                        "attachment" : {
                            "properties" : {
                                "content" : { 
                                    "type" : "text",
                                    "store": true,
                                    "term_vector": "with_positions_offsets"
                                 },
                                "content_length" : { "type" : "long" },
                                "content_type" : { "type" : "keyword" },
                                "language" : { "type" : "keyword"},
                                "date" : { "type" : "date", "format" : "EEE, d MMM yyyy HH:mm:ss Z" }
                            }
                        },
                        "content" : { "type": "keyword" },
                        "name" : { "type" : "keyword" }
                    }
                }
            }
        }
    }
}'
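
Step 3 can also be done from a script. Below is a minimal Python sketch that builds the indexing request with the pipeline=attachment query parameter, using only the standard library. The function name build_index_request is hypothetical; the host localhost:9200, the index name my-index-000001 and the document ID my_id are taken from the examples in the question and are assumptions about the target cluster. The request is only constructed here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_index_request(base_url, index, doc_id, doc, pipeline="attachment"):
    """Build (but do not send) a PUT request that indexes doc through a pipeline."""
    # The pipeline name goes into the query string, as in the question's example.
    query = urllib.parse.urlencode({"pipeline": pipeline})
    url = f"{base_url}/{index}/_doc/{doc_id}?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    doc = {
        "sub": "This is a test post",
        "attachments": [
            {"filename": "test.txt", "data": "VGhpcyBpcyBhIHRlc3QK"},
        ],
    }
    req = build_index_request(
        "http://localhost:9200", "my-index-000001", "my_id", doc
    )
    print(req.get_method(), req.full_url)
    # To send it to a running cluster, something like:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to check the URL and body before pointing the script at a real cluster.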