Reindexing Elasticsearch index with parent and child relationship

We currently have a 'message' that can have a link to a 'parent' message. E.g. a reply would have the original message as its parent_id.

PUT {
  "mappings": {
    "message": {
      "properties": {
        "subject": {
          "type": "text"
        },
        "body": {
          "type": "text"
        },
        "parent_id": {
          "type": "long"
        }
      }
    }
  }
}

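Currently we don't have an Elasticsearch parent-child join on the documents, because parent and child were not allowed to be of the same type. Now, with 5.6 and Elastic's drive to get rid of types, we are trying to use the new parent-child join in 5.6.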

PUT {
  "settings": {
    "mapping.single_type": true
  },
  "mappings": {
    "message": {
      "properties": {
        "subject": {
          "type": "text"
        },
        "body": {
          "type": "text"
        },
        "join_field": {
          "type": "join",
          "relations": {
            "parent_message": "child_message"
          }
        }
      }
    }
  }
}

I know I have to create a new index for this and then reindex everything with _reindex, but I'm not quite sure how to go about it.

If I index a parent_message, it's simple:

PUT localhost:9200/testm1/message/1
{
  "subject": "Message 1",
  "body": "body 1"
}
PUT localhost:9200/testm1/message/3?routing=1
{
  "subject": "Message Reply to 1",
  "body": "body 3",
  "join_field": {
    "name": "child_message",
    "parent": "1"
  }
}

A search now returns:

{
                "_index": "testm1",
                "_type": "message",
                "_id": "2",
                "_score": 1,
                "_source": {
                    "subject": "Message 2",
                    "body": "body 2"
                }
            },
            {
                "_index": "testm1",
                "_type": "message",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "subject": "Message 1",
                    "body": "body 1"
                }
            },
            {
                "_index": "testm1",
                "_type": "message",
                "_id": "3",
                "_score": 1,
                "_routing": "1",
                "_source": {
                    "subject": "Message Reply to 1",
                    "body": "body 3",
                    "join_field": {
                        "name": "child_message",
                        "parent": "1"
                    }
                }
            }
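
Once documents are joined like this, parents can be found through their replies with a has_child query. The following is only a sketch under assumptions: the helper name has_child_query and the match clause are made up for illustration, and the resulting dict is meant to be sent as the body of a _search request against the joined index. The "type" value is the child relation name from the join mapping (child_message).

```python
import json
from typing import Any, Dict


def has_child_query(child_relation: str, child_query: Dict[str, Any]) -> Dict[str, Any]:
    """Build a _search body that returns parents having at least one matching child.

    child_relation is the child name declared in the join field's "relations"
    (here "child_message"); child_query is any query run against the children.
    """
    return {
        "query": {
            "has_child": {
                "type": child_relation,
                "query": child_query,
            }
        }
    }


# Hypothetical example: parent messages that have a reply whose subject mentions "reply".
body = has_child_query("child_message", {"match": {"subject": "reply"}})
print(json.dumps(body, indent=2))
```

With the three documents above, a query of this shape would be expected to match message 1 only, since message 3 is its sole child.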

I tried creating the new index (testmnew) and then running a _reindex:

POST _reindex
{
    "source": {
        "index" : "testm"
    },
    "dest" :{
        "index" : "testmnew"
    },
    "script" : {
        "inline" : """
        ctx._routing = ctx._source.parent_id;
        // --> Missing: need to set join_field here as well, I guess <--
        """
        }
}

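The script part is still not quite clear to me. But am I on the right path? Would I simply set _routing on the message (it would be null on parent messages)? And how do I set the join_field for the child messages?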

This is the reindex script I ended up using:

curl -XPOST 'localhost:9200/_reindex' -H 'Content-Type: application/json' -d'
{
    "source": {
        "index" : "testm"
    },
    "dest" :{
        "index" : "testmnew"
    },
    "script" : {
        "lang" : "painless",
        "source" : "if(ctx._source.parent_id != null){ctx._routing = ctx._source.parent_id; ctx._source.join_field=  params.cjoin; ctx._source.join_field.parent = ctx._source.parent_id;}else{ctx._source.join_field = params.parent_join}",
        "params" : {
            "cjoin" :{
                "name": "child_message",
                "parent": 1
            },
            "parent_join" : {"name": "parent_message"}

        }
    }
}
'
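
To check the two branches of that script against sample documents before running a real reindex, the per-document logic can be modeled in plain Python. This is a rough sketch, not Painless and not part of the original answer: the function name to_join_document is hypothetical, it assumes parent_id is simply absent or null on parent messages, and it returns the routing as a string, which is an assumption about how the value ends up being used.

```python
from typing import Any, Dict, Optional, Tuple


def to_join_document(source: Dict[str, Any]) -> Tuple[Dict[str, Any], Optional[str]]:
    """Return (new_source, routing) for one message, mirroring the script's if/else."""
    doc = dict(source)  # shallow copy so the caller's dict is left untouched
    parent_id = doc.get("parent_id")
    if parent_id is not None:
        # Child message: route to the parent's shard and point join_field at the parent.
        doc["join_field"] = {"name": "child_message", "parent": parent_id}
        return doc, str(parent_id)
    # Parent message: no routing, join_field carries only the relation name.
    doc["join_field"] = {"name": "parent_message"}
    return doc, None


# Example documents shaped like the ones in the old index.
reply, reply_routing = to_join_document(
    {"subject": "Message Reply to 1", "body": "body 3", "parent_id": 1}
)
parent, parent_routing = to_join_document({"subject": "Message 1", "body": "body 1"})
print(reply["join_field"], reply_routing)
print(parent["join_field"], parent_routing)
```

Unlike the script, which assigns params.cjoin and then overwrites its parent value, this model builds a fresh join_field dict for every child, so no object is shared between documents. As in the script, parent_id is left in the source.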