Elasticsearch 7.2 cluster has unassigned shards

I want to build a three-node Elasticsearch cluster with version 7.2, but I ran into something unexpected.

I have three virtual machines: 192.168.7.2, 192.168.7.3, and 192.168.7.4. Their main settings in config/elasticsearch.yml are:

cluster.name: ucas
node.name: node-2
network.host: 192.168.7.2
http.port: 9200
discovery.seed_hosts: ["192.168.7.2", "192.168.7.3", "192.168.7.4"]
cluster.initial_master_nodes: ["node-2", "node-3", "node-4"]
http.cors.enabled: true
http.cors.allow-origin: "*"

cluster.name: ucas
node.name: node-3
network.host: 192.168.7.3
http.port: 9200
discovery.seed_hosts: ["192.168.7.2", "192.168.7.3", "192.168.7.4"]
cluster.initial_master_nodes: ["node-2", "node-3", "node-4"]

cluster.name: ucas
node.name: node-4
network.host: 192.168.7.4
http.port: 9200
discovery.seed_hosts: ["192.168.7.2", "192.168.7.3", "192.168.7.4"]
cluster.initial_master_nodes: ["node-2", "node-3", "node-4"]

After starting every node, I create an index named moive with 3 shards and 0 replicas and write some documents to it. The cluster looks normal:

PUT moive
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 0
  }
}


PUT moive/_doc/3
{
  "title":"title 3"
}

Then I set the number of replicas for moive to 1:

PUT moive/_settings
{
  "number_of_replicas": 1
}

Everything works, but when I set the number of replicas for moive to 2:

PUT moive/_settings
{
  "number_of_replicas": 2
}

the new replicas cannot be assigned to node-2.

I don't know which step went wrong. Please help me figure it out.
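As background for the settings used above, here is a minimal sketch of the copy arithmetic. It assumes only the rule that Elasticsearch never places two copies of the same shard on one node, and it ignores every other allocation rule. The function name `unassigned_copies` and the `eligible_nodes` parameter are hypothetical, introduced for illustration.

```python
def unassigned_copies(shards: int, replicas: int, eligible_nodes: int) -> int:
    """Count shard copies that cannot be placed when each node may hold
    at most one copy of any given shard (simplified model)."""
    copies_per_shard = 1 + replicas  # one primary plus its replicas
    missing_per_shard = max(0, copies_per_shard - eligible_nodes)
    return shards * missing_per_shard


if __name__ == "__main__":
    # 3 shards, 1 replica: two copies per shard fit on three nodes.
    print(unassigned_copies(shards=3, replicas=1, eligible_nodes=3))
    # 3 shards, 2 replicas: three copies per shard need all three nodes.
    print(unassigned_copies(shards=3, replicas=2, eligible_nodes=3))
    # If one node refuses new shards, one replica per shard stays unassigned.
    print(unassigned_copies(shards=3, replicas=2, eligible_nodes=2))
```

With 2 replicas on a three-node cluster, every node has to accept a copy of every shard, so a single node that rejects allocations is enough to leave replicas unassigned.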

First, use the allocation explain API to find out why the shard cannot be allocated:


GET _cluster/allocation/explain?pretty



{
  "index" : "moive",
  "shard" : 2,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "NODE_LEFT",
    "at" : "2019-07-19T06:47:29.704Z",
    "details" : "node_left [tIm8GrisRya8jl_n9lc3MQ]",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "kQ0Noq8LSpyEcVDF1POfJw",
      "node_name" : "node-3",
      "transport_address" : "192.168.7.3:9300",
      "node_attributes" : {
        "ml.machine_memory" : "5033172992",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "store" : {
        "matching_sync_id" : true
      },
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[moive][2], node[kQ0Noq8LSpyEcVDF1POfJw], [R], s[STARTED], a[id=Ul73SPyaTSyGah7Yl3k2zA]]"
        }
      ]
    },
    {
      "node_id" : "mNpqD9WPRrKsyntk2GKHMQ",
      "node_name" : "node-4",
      "transport_address" : "192.168.7.4:9300",
      "node_attributes" : {
        "ml.machine_memory" : "5033172992",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "store" : {
        "matching_sync_id" : true
      },
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists [[moive][2], node[mNpqD9WPRrKsyntk2GKHMQ], [P], s[STARTED], a[id=yQo1HUqoSdecD-SZyYMYfg]]"
        }
      ]
    },
    {
      "node_id" : "tIm8GrisRya8jl_n9lc3MQ",
      "node_name" : "node-2",
      "transport_address" : "192.168.7.2:9300",
      "node_attributes" : {
        "ml.machine_memory" : "5033172992",
        "ml.max_open_jobs" : "20",
        "xpack.installed" : "true"
      },
      "node_decision" : "no",
      "deciders" : [
        {
          "decider" : "disk_threshold",
          "decision" : "NO",
          "explanation" : "the node is above the low watermark cluster setting [cluster.routing.allocation.disk.watermark.low=85%], using more disk space than the maximum allowed [85.0%], actual free: [2.2790256709451573E-4%]"
        }
      ]
    }
  ]
}
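The response above is long, so it can help to reduce it to the deciders that said NO for each node. The following is a rough sketch, not part of any Elasticsearch client: `blocking_deciders` is a hypothetical helper, and `sample` is a hand-trimmed copy of the response above that keeps only the fields the helper reads.

```python
def blocking_deciders(explain: dict) -> dict:
    """Map each node name to the list of deciders that returned NO."""
    summary = {}
    for node in explain.get("node_allocation_decisions", []):
        summary[node["node_name"]] = [
            d["decider"]
            for d in node.get("deciders", [])
            if d["decision"] == "NO"
        ]
    return summary


# Trimmed copy of the allocation explain response shown above.
sample = {
    "index": "moive",
    "shard": 2,
    "primary": False,
    "node_allocation_decisions": [
        {"node_name": "node-3",
         "deciders": [{"decider": "same_shard", "decision": "NO"}]},
        {"node_name": "node-4",
         "deciders": [{"decider": "same_shard", "decision": "NO"}]},
        {"node_name": "node-2",
         "deciders": [{"decider": "disk_threshold", "decision": "NO"}]},
    ],
}

if __name__ == "__main__":
    for name, deciders in blocking_deciders(sample).items():
        print(name, deciders)
```

Read this way, node-3 and node-4 are rejected only because they already hold a copy of shard 2, which is expected. node-2 is the only node that could take the replica, and it is rejected by the disk_threshold decider.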

We can see that the disk on node-2 is almost full:

[vagrant@node2 ~]$ df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root  8.4G  8.0G  480M  95% /
devtmpfs                 2.4G     0  2.4G   0% /dev
tmpfs                    2.4G     0  2.4G   0% /dev/shm
tmpfs                    2.4G  8.4M  2.4G   1% /run
tmpfs                    2.4G     0  2.4G   0% /sys/fs/cgroup
/dev/sda1                497M  118M  379M  24% /boot
none                     234G  149G   86G  64% /vagrant
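To relate the df output to the 85% low watermark from the explain response, here is a small sketch of the comparison. It is an approximation: Elasticsearch works from the free bytes it sees on its data path, which can differ from df's Use% column. The helper names and the choice of "/" as the path to inspect are assumptions for illustration.

```python
import shutil

# Value reported in the explain output above for
# cluster.routing.allocation.disk.watermark.low.
LOW_WATERMARK_PERCENT = 85.0


def used_percent(total: float, used: float) -> float:
    """Return disk usage as a percentage of total capacity."""
    return 100.0 * used / total


def above_low_watermark(total: float, used: float,
                        low: float = LOW_WATERMARK_PERCENT) -> bool:
    """True when usage exceeds the low watermark percentage."""
    return used_percent(total, used) > low


if __name__ == "__main__":
    # Assumption: the Elasticsearch data directory lives on the root filesystem.
    usage = shutil.disk_usage("/")
    used = usage.total - usage.free
    print(f"used: {used_percent(usage.total, used):.1f}%")
    print("above low watermark:", above_low_watermark(usage.total, used))
```

With the figures from the df output, `above_low_watermark(8.4, 8.0)` is true for the root filesystem, which is consistent with node-2 refusing new shards.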

After I freed up disk space on node-2, everything returned to normal: