Store mixed data types in ElasticSearch

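I am using logstash to manage my application logs. I want to store some context data along with each log entry. This context data does not need to be indexed, but its structure and data type can differ depending on the application context. For example, the context can take any of the following forms.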

A string

{
    error: "This is a sample error message"
}

An array

{
    error: [
        "This is an error message", 
        "This is another message", 
        "This is the final message"
    ]
}

Or it can be an object

{
    error: {
        user_name: "Username cannot be empty",
        user_email: "Email address is already in use",
        user_password: "Passwords do not match"
    }
}

Is it possible to have such a field in ElasticSearch? The field does not need to be indexed, it only needs to be stored.

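I don't think it's possible to do exactly what you are asking. However, you get your first two examples for free, because any field can hold a list of values: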

curl -XDELETE "http://localhost:9200/test_index"

curl -XPUT "http://localhost:9200/test_index" -d'
{
    "mappings": {
        "doc": {
            "properties": {
                "error": {
                    "type": "string",
                    "index": "not_analyzed"
                }
            }
        }
    }
}'

curl -XPUT "http://localhost:9200/test_index/doc/1" -d'
{
    "error": "This is a sample error message"
}'

curl -XPUT "http://localhost:9200/test_index/doc/2" -d'
{
    "error": [
        "This is an error message", 
        "This is another message", 
        "This is the final message"
    ]
}'

curl -XPOST "http://localhost:9200/test_index/_search"
...
{
   "took": 2,
   "timed_out": false,
   "_shards": {
      "total": 5,
      "successful": 5,
      "failed": 0
   },
   "hits": {
      "total": 2,
      "max_score": 1,
      "hits": [
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "1",
            "_score": 1,
            "_source": {
               "error": "This is a sample error message"
            }
         },
         {
            "_index": "test_index",
            "_type": "doc",
            "_id": "2",
            "_score": 1,
            "_source": {
               "error": [
                  "This is an error message",
                  "This is another message",
                  "This is the final message"
               ]
            }
         }
      ]
   }
}
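If the object form from the third example also has to fit into this string mapping, one option is to flatten it into a list of strings in the application before indexing. The following is only a sketch of that idea, not something ElasticSearch or logstash does for you; the function name `normalize_error` and the "key: message" convention are hypothetical choices made for illustration.

```python
import json


def normalize_error(context):
    """Coerce a string, list, or dict into a flat list of strings.

    Hypothetical helper: the goal is that every document sends `error`
    as a list of strings, which a single string mapping can accept.
    """
    if context is None:
        return []
    if isinstance(context, str):
        return [context]
    if isinstance(context, dict):
        # Assumed convention: keep the key as a "key: message" prefix.
        return [f"{key}: {value}" for key, value in context.items()]
    if isinstance(context, (list, tuple)):
        return [str(item) for item in context]
    # Anything else (numbers, booleans) becomes a single string entry.
    return [str(context)]


if __name__ == "__main__":
    doc = {"error": normalize_error({"user_name": "Username cannot be empty"})}
    # The resulting JSON is what would be sent as the document body.
    print(json.dumps(doc))
```

The trade-off is that the object structure is lost: the keys survive only as text inside each string.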

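Alternatively, you could set up the mapping according to your third example and then use only the fields each document needs (which may complicate your application code):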

curl -XDELETE "http://localhost:9200/test_index"

curl -XPUT "http://localhost:9200/test_index"

curl -XPUT "http://localhost:9200/test_index/doc/3" -d'
{
    "error": {
        "user_name": "Username cannot be empty",
        "user_email": "Email address is already in use",
        "user_password": "Passwords do not match"
    }
}'

curl -XGET "http://localhost:9200/test_index/_mapping"
...
{
   "test_index": {
      "mappings": {
         "doc": {
            "properties": {
               "error": {
                  "properties": {
                     "user_email": {
                        "type": "string"
                     },
                     "user_name": {
                        "type": "string"
                     },
                     "user_password": {
                        "type": "string"
                     }
                  }
               }
            }
         }
      }
   }
}
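With this object mapping in place, the plain string and array forms would have to be wrapped in an object on the application side so that `error` always has the same shape. Here is a rough sketch of what that extra application code might look like; the helper name `wrap_error` and the `messages` key are made up for this example and are not part of the mapping shown above.

```python
def wrap_error(context, default_key="messages"):
    """Return a dict so that `error` is always sent as an object.

    Hypothetical helper: `default_key` is an assumed field name under
    which bare strings and lists are stored.
    """
    if isinstance(context, dict):
        # Already an object (third example): send it unchanged.
        return context
    if isinstance(context, (list, tuple)):
        # Array form (second example): nest the list under one key.
        return {default_key: list(context)}
    # String form (first example): store it as a one-element list.
    return {default_key: [context]}


if __name__ == "__main__":
    print(wrap_error("This is a sample error message"))
    print(wrap_error(["This is an error message", "This is another message"]))
    print(wrap_error({"user_name": "Username cannot be empty"}))
```

Each distinct key sent this way adds another sub-field to the `error` mapping, so this fits best when the set of keys stays small.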

So basically the direct answer to your question is "No", unless I'm missing something (which is quite possible).

Here is the code I used:

http://sense.qbox.io/gist/18476aa6c2ad2fa554b472d09934559c884bec33