Elasticsearch copy_to: data needs to be copied into its own subdocument

Thanks in advance for your help.

I have created the ES mapping as:

{"mappings": {
            "policy": {
                "properties": {
                    "name": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "tags": {
                        "properties": {
                            "scope": {
                                "type": "text",
                                "store": "true",
                                "copy_to": [
                                    "tags.tag_scope"
                                ]
                            },
                            "tag": {
                                "type": "text",
                                "store": "true",
                                "copy_to": [
                                    "tags.tag_scope"
                                ]
                            },
                            "tag_scope": {
                                "type": "text",
                                "store": "true"
                            }
                        }
                    }
                }
            }
        }

    }

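When I index a policy document, all tag and scope values from the different tag objects are copied into the tag_scope property.

For example, I added this document to Elasticsearch: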

{
                    "name": "policy1",
                    "tags": [
                        {
                            "tag": "pepsi",
                            "scope": "prod"
                        },
                        {
                            "tag": "coke",
                            "scope": "dev"
                        }
                    ]
                }

It stores all 4 values in a single tag_scope field as:

"tags.tag_scope": [ "pepsi", "prod", "coke", "dev" ]

What I expected is that it would be stored like this:

 {
                        "name": "policy1",
                        "tags": [
                            {
                                "tag": "pepsi",
                                "scope": "prod",
                                 "tag_scope" : ["pepsi","prod"]
                            },
                            {
                                "tag": "coke",
                                "scope": "dev",
                                 "tag_scope" : ["coke","dev"]
                            }
                        ]
                    }

Can you help me get the mapping right?

What you are looking for is the Nested datatype. Change your mapping to the following:

PUT <your_index_name>
{  
   "mappings":{  
      "policy":{ 
         "properties":{  
            "name":{  
               "type":"text",
               "fields":{  
                  "keyword":{  
                     "type":"keyword",
                     "ignore_above":256
                  }
               }
            },
            "tags":{  
               "type": "nested", 
               "properties":{  
                  "scope":{  
                     "type":"text",
                     "store":"true",
                     "copy_to":[  
                        "tags.tag_scope"
                     ]
                  },
                  "tag":{  
                     "type":"text",
                     "store":"true",
                     "copy_to":[  
                        "tags.tag_scope"
                     ]
                  },
                  "tag_scope":{  
                     "type":"text",
                     "store":"true",
                     "fields": {                <---- Added this
                       "keyword": {
                          "type": "keyword"
                       }
                     }
                  }
               }
            }
         }
      }
   }
}

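Note how I have made tags a nested type. This allows each object like the one below to be stored as a separate document in its own right; in your case tags basically holds two nested documents. I also added a keyword sub-field to tag_scope (marked in the mapping), which the aggregation query further down relies on.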

{  
   "tag":"coke",
   "scope":"dev"
}

Now your tags.tag_scope should look the way you expect.

As for querying what you are looking for, below is how the nested query should look.

Nested query:
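
To make the difference concrete, here is a rough Python model of the two behaviours. It is only a sketch of what the object and nested field types are documented to do, not Elasticsearch's actual code, and the function names copy_to_flattened and copy_to_nested are made up for illustration.

```python
def copy_to_flattened(tags):
    """Model a plain object field: the array of objects is flattened,
    so every copied value ends up in one shared tags.tag_scope list."""
    tag_scope = []
    for obj in tags:
        tag_scope.extend([obj["tag"], obj["scope"]])
    return {"tags.tag_scope": tag_scope}


def copy_to_nested(tags):
    """Model a nested field: each object is indexed as its own hidden
    document, so copy_to only sees the values of that one object."""
    return [
        {**obj, "tag_scope": [obj["tag"], obj["scope"]]}
        for obj in tags
    ]


# The tags array from the example policy document.
tags = [
    {"tag": "pepsi", "scope": "prod"},
    {"tag": "coke", "scope": "dev"},
]

flattened = copy_to_flattened(tags)
nested = copy_to_nested(tags)
print(flattened)
print(nested)
```

With the original mapping you get the first shape (one list holding all four values); with "type": "nested" each tag object keeps its own tag_scope pair.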

POST <your_index_name>/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "tags",
            "query": {
              "bool": {
                "must": [
                  {
                    "match": {
                      "tags.tag_scope": "pepsi"
                    }
                  },
                  {
                    "match": {
                      "tags.tag_scope": "prod"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
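
If you build this request from application code, a small helper along these lines may be handy. This is a sketch under assumptions: the helper name nested_all_terms_query is hypothetical, it only builds the request body as a plain dict (sending it is left to whatever client you use), and it omits the outer bool/must wrapper from the query above, which should behave the same when there is only a single clause.

```python
import json


def nested_all_terms_query(path, field, terms):
    """Build a nested query body that requires every term to match
    `field` inside the same nested object under `path`."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        # one match clause per required term
                        "must": [{"match": {field: term}} for term in terms]
                    }
                },
            }
        }
    }


body = nested_all_terms_query("tags", "tags.tag_scope", ["pepsi", "prod"])
print(json.dumps(body, indent=2))
```

Because both match clauses sit inside one nested query, they have to be satisfied by the same tag object, so a policy with pepsi/dev and coke/prod would not match.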

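As for returning the list of unique tags.tag_scope values, you need an aggregation query. Note that I have specified size: 0, which means I only want to see the aggregation results and not the normal query hits.

Aggregation query: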

POST <your_index_name>/_search
{  
   "size":0,
   "query":{  
      "bool":{  
         "must":[  
            {  
               "nested":{  
                  "path":"tags",
                  "query":{  
                     "bool":{  
                        "must":[  
                           {  
                              "match":{  
                                 "tags.tag_scope":"pepsi"
                              }
                           },
                           {  
                              "match":{  
                                 "tags.tag_scope":"prod"
                              }
                           }
                        ]
                     }
                  }
               }
            }
         ]
      }
   },
   "aggs":{                        <----- Aggregation Query Starts Here
      "myscope":{  
         "nested":{  
            "path":"tags"
         },
         "aggs":{  
            "uniqui_scope":{  
               "terms":{  
                  "field":"tags.tag_scope.keyword",
                  "size":10
               }
            }
         }
      }
   }
}

Aggregation response:

{
  "took": 53,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "myscope": {
      "doc_count": 2,
      "uniqui_scope": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "coke",
            "doc_count": 1
          },
          {
            "key": "dev",
            "doc_count": 1
          },
          {
            "key": "pepsi",
            "doc_count": 1
          },
          {
            "key": "prod",
            "doc_count": 1
          }
        ]
      }
    }
  }
}
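
Note that the query only selects the parent policy document; the nested aggregation then runs over all nested tag documents of that parent, which is why coke and dev show up as well. A simplified Python model of how the buckets come about is sketched below. The function terms_buckets is hypothetical, and the tie-break by key is my assumption about the default ordering rather than something taken from the response.

```python
from collections import Counter

# tag_scope.keyword values of each nested document of the matching policy
nested_docs = [
    {"tag_scope": ["pepsi", "prod"]},
    {"tag_scope": ["coke", "dev"]},
]


def terms_buckets(docs, field, size=10):
    """Count how many nested documents contain each distinct value and
    return at most `size` buckets, ordered by count (desc), then key."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc[field]))  # a doc counts once per distinct value
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered[:size]]


buckets = terms_buckets(nested_docs, "tag_scope")
print(buckets)
```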

Hope this helps.