Etcd cluster setup failure

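I am trying to set up a 3-node etcd cluster on Ubuntu machines as the data store for Docker networking. I successfully created an etcd cluster using the etcd Docker image. Now, when I try to replicate it, the steps fail on one node. Even after I remove the failed node from the setup, the cluster keeps looking for the removed node. I get the same error when I use the etcd binary.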

I ran the following command on every node, changing the IP addresses accordingly:

docker run -d -v /usr/share/ca-certificates/:/etc/ssl/certs -p 4001:4001 -p 2380:2380 -p 2379:2379 \
 --name etcd quay.io/coreos/etcd \
 -name etcd0 \
 -advertise-client-urls http://172.27.59.141:2379,http://172.27.59.141:4001 \
 -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 \
 -initial-advertise-peer-urls http://172.27.59.141:2380 \
 -listen-peer-urls http://0.0.0.0:2380 \
 -initial-cluster-token etcd-cluster-1 \
 -initial-cluster etcd0=http://172.27.59.141:2380,etcd1=http://172.27.59.244:2380,etcd2=http://172.27.59.232:2380 \
 -initial-cluster-state new

Two of the nodes connect fine, but the service stops on the third node. Here is the log from the third node:

2016-06-16 17:16:34.293248 I | etcdmain: etcd Version: 2.3.6
2016-06-16 17:16:34.294368 I | etcdmain: Git SHA: 128344c
2016-06-16 17:16:34.294584 I | etcdmain: Go Version: go1.6.2
2016-06-16 17:16:34.294781 I | etcdmain: Go OS/Arch: linux/amd64
2016-06-16 17:16:34.294962 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
2016-06-16 17:16:34.295142 W | etcdmain: no data-dir provided, using default data-dir ./node2.etcd
2016-06-16 17:16:34.295438 I | etcdmain: listening for peers on http://0.0.0.0:2380
2016-06-16 17:16:34.295654 I | etcdmain: listening for client requests on http://0.0.0.0:2379
2016-06-16 17:16:34.295846 I | etcdmain: listening for client requests on http://0.0.0.0:4001
2016-06-16 17:16:34.296193 I | etcdmain: stopping listening for client requests on http://0.0.0.0:4001
2016-06-16 17:16:34.301139 I | etcdmain: stopping listening for client requests on http://0.0.0.0:2379
2016-06-16 17:16:34.301454 I | etcdmain: stopping listening for peers on http://0.0.0.0:2380
2016-06-16 17:16:34.301718 I | etcdmain: --initial-cluster must include node2=http://172.27.59.232:2380 given --initial-advertise-peer-urls=http://172.27.59.232:2380

Even after removing the failed node, I can see the other two nodes still waiting for the third node to connect:

2016-06-16 17:16:12.063893 N | etcdserver: added member 17879927ec74147b [http://172.27.59.232:238] to cluster ba4424e006edb53e
2016-06-16 17:16:12.064431 N | etcdserver: added local member 24d9feabb7e2f26f [http://172.27.59.244:2380] to cluster ba4424e006edb53e
2016-06-16 17:16:12.065229 N | etcdserver: added member 2bda70be57138cfe [http://172.27.59.141:2380] to cluster ba4424e006edb53e
2016-06-16 17:16:12.218560 I | raft: 24d9feabb7e2f26f [term: 1] received a MsgVote message with higher term from 2bda70be57138cfe [term: 29]
2016-06-16 17:16:12.218964 I | raft: 24d9feabb7e2f26f became follower at term 29
2016-06-16 17:16:12.219276 I | raft: 24d9feabb7e2f26f [logterm: 1, index: 3, vote: 0] voted for 2bda70be57138cfe [logterm: 1, index: 3] at term 29
2016-06-16 17:16:12.222667 I | raft: raft.node: 24d9feabb7e2f26f elected leader 2bda70be57138cfe at term 29
2016-06-16 17:16:12.335904 I | etcdserver: published {Name:node1 ClientURLs:[http://172.27.59.244:2379 http://172.27.59.244:4001]} to cluster ba4424e006edb53e
2016-06-16 17:16:12.336459 N | etcdserver: set the initial cluster version to 2.2
2016-06-16 17:16:42.059177 W | rafthttp: the connection to peer 17879927ec74147b is unhealthy
2016-06-16 17:17:12.060313 W | rafthttp: the connection to peer 17879927ec74147b is unhealthy
2016-06-16 17:17:42.060986 W | rafthttp: the connection to peer 17879927ec74147b is unhealthy

As the log shows, the cluster is still looking for the third node even though I started it with only two nodes.

Is there a location on local disk where data is being saved, so that old data is picked up even though I did not provide a data directory?

Please point out what I am missing.

Yes, the membership data is already stored in the default data directories, node0.etcd and node1.etcd.

You should find the following message in the log, which indicates that the server already belongs to a cluster:

etcdmain: the server is already initialized as member before, starting as etcd member...
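A quick way to confirm this is to look for a leftover default data directory. The following is only a sketch: it assumes etcd was started without a data directory option, so it fell back to `./<name>.etcd` in its working directory (as the "no data-dir provided" warning in the question's log suggests). The helper name `check_default_data_dir` is hypothetical, and when etcd runs in Docker the check has to be made inside the container's filesystem.

```shell
# Hypothetical helper: reports whether a default etcd data directory for the
# given member name exists under the current working directory.
check_default_data_dir() {
  name="$1"
  if [ -d "./${name}.etcd" ]; then
    echo "found ./${name}.etcd (old membership data will be reused)"
  else
    echo "no ./${name}.etcd (member will bootstrap from --initial-cluster)"
  fi
}

# Member names taken from the logs in the question.
check_default_data_dir node0
check_default_data_dir node1
```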

To run a new cluster with two members, just add one more argument to the command:

--data-dir bak
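Put together, the flags for each of the two remaining members might look like the sketch below. It is an illustration rather than a tested recipe: the helper `etcd_args` is hypothetical, the member names node0 and node1 are inferred from the logs and data directory names above, the legacy 4001 client port is left out for brevity, and the initial cluster list is assumed to contain only the two surviving peers.

```shell
# Hypothetical helper: prints the etcd flags for one member of a fresh
# two-member cluster that uses a new, empty data directory.
etcd_args() {
  name="$1"; ip="$2"; data_dir="$3"
  # Two-member list; the failed third node is deliberately left out.
  cluster="node0=http://172.27.59.141:2380,node1=http://172.27.59.244:2380"
  printf '%s ' \
    --name "$name" \
    --data-dir "$data_dir" \
    --advertise-client-urls "http://$ip:2379" \
    --listen-client-urls "http://0.0.0.0:2379" \
    --initial-advertise-peer-urls "http://$ip:2380" \
    --listen-peer-urls "http://0.0.0.0:2380" \
    --initial-cluster-token etcd-cluster-1 \
    --initial-cluster "$cluster" \
    --initial-cluster-state new
  echo
}

# Print the flags for the first member; pass them to etcd (or append them
# to the docker run command from the question) once they look right.
etcd_args node0 172.27.59.141 bak
```

Note that the value given to `--name` must match one of the names in `--initial-cluster`; otherwise etcd refuses to start with the "--initial-cluster must include ..." error seen in the third node's log.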