How can I restore an etcd cluster from a snapshot in a Docker image on CoreOS?
I have a Kubernetes cluster (v1.5.6) on VMware with a three-node etcd cluster (etcd version 3.1.5).
The etcd nodes run in three Docker containers (on three hosts) on CoreOS on VMware.
I tried to back up etcd with the following command:
docker run --rm --net=host -v /tmp:/etcd_backup -e ETCDCTL_API=3 quay.io/coreos/etcd:v3.1.5 etcdctl --endpoints=[1.1.1.1:2379,2.2.2.2:2379,3.3.3.3:2379] snapshot save etcd_backup/snapshot.db
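The backup completed successfully.
I want to create this Kubernetes cluster from scratch in another VMware environment, but for that I need to restore etcd from the snapshot.
So far I have not found a solution that works for etcd running in Docker containers.
I tried to restore with the following method, but unfortunately it did not work.
First, I created a new etcd node and ran the following command: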
docker run --rm --net=host -v /tmp/etcd_bak:/etcd_backup -e ETCDCTL_API=3 registry:5000/quay.io/coreos/etcd:v3.1.5 etcdctl snapshot restore etcd_backup/snapshot.db --name etcd0 --initial-cluster etcd0=http://etcd0:2380,etcd1=http://etcd1:2380,etcd2=http://etcd2:2380 --initial-cluster-token etcd-cluster-1 --initial-advertise-peer-urls http://etcd0:2380
Result:
2018-06-04 09:25:52.314747 I | etcdserver/membership: added member 7ff5c9c6942f82e [http://etcd0:2380] to cluster 5d1b637f4b7740d5
2018-06-04 09:25:52.314940 I | etcdserver/membership: added member 91b417e7701c2eeb [http://etcd2:2380] to cluster 5d1b637f4b7740d5
2018-06-04 09:25:52.315096 I | etcdserver/membership: added member faeb78734ee4a93d [http://etcd1:2380] to cluster 5d1b637f4b7740d5
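Unfortunately, nothing happened after that.
What is a good way to restore an etcd backup?
How do I create an empty etcd cluster/node, and how should I restore the snapshot into it?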
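According to the etcd Disaster Recovery documentation, you need to restore all three etcd nodes from the snapshot with a command like the one you used, and then start each of the three nodes with a command like the one below. Note that etcdctl snapshot restore only writes a new data directory and then exits; it does not start an etcd server. Because your container was started with --rm and without a --data-dir on a mounted volume, the restored directory was discarded together with the container.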
etcd \
--name m1 \
--listen-client-urls http://host1:2379 \
--advertise-client-urls http://host1:2379 \
--listen-peer-urls http://host1:2380 &
Alternatively, you can extract etcdctl from the image like this:
docker run --rm -v /opt/bin:/opt/bin registry:5000/quay.io/coreos/etcd:v3.1.5 cp /usr/local/bin/etcdctl /opt/bin
Then use etcdctl to restore the snapshot:
# ETCDCTL_API=3 ./etcdctl snapshot restore snapshot.db \
--name m1 \
--initial-cluster m1=http://host1:2380,m2=http://host2:2380,m3=http://host3:2380 \
--initial-cluster-token etcd-cluster-1 \
--initial-advertise-peer-urls http://host1:2380 \
--data-dir /var/lib/etcd
This restores the snapshot into the /var/lib/etcd directory. Then start etcd with Docker; do not forget to mount /var/lib/etcd into the container and to pass it to etcd as --data-dir.
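The last step can be sketched roughly as follows. This is a minimal sketch, not a tested recipe: the helper name etcd_run_cmd, the container name etcd-<name>, and the image reference without the private registry prefix are assumptions made for illustration, while the etcd flags mirror the ones shown above. The second argument is assumed to be an IP address, since etcd expects IPs rather than host names in its listen URLs. The function only prints the docker run command so that it can be reviewed before anything is started.

```shell
#!/bin/sh
# Print (not run) a docker command that starts one etcd member on top of a
# data directory previously written by `etcdctl snapshot restore`.
# $1 = member name (e.g. m1), $2 = IP address of this host (placeholder).
etcd_run_cmd() {
    name=$1
    ip=$2
    data_dir=/var/lib/etcd   # must match --data-dir used for the restore
    printf '%s\n' "docker run -d --name etcd-${name} --net=host \
-v ${data_dir}:${data_dir} \
quay.io/coreos/etcd:v3.1.5 /usr/local/bin/etcd \
--name ${name} \
--data-dir ${data_dir} \
--listen-client-urls http://${ip}:2379 \
--advertise-client-urls http://${ip}:2379 \
--listen-peer-urls http://${ip}:2380"
}
```

Calling etcd_run_cmd m1 10.0.0.1 (10.0.0.1 being a placeholder address) prints a single command line; once it looks right, it can be piped to sh on the host in question.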
etcd in Kubernetes runs in Docker containers; this is what I did to restore the cluster:
Retrieve the etcd cluster metadata
docker inspect etcd1
You will get something like the following:
"Binds": [
"/etc/ssl/certs:/etc/ssl/certs:ro",
"/etc/ssl/etcd/ssl:/etc/ssl/etcd/ssl:ro",
"/var/lib/etcd:/var/lib/etcd:rw"
],
...
"Env": [
"ETCD_DATA_DIR=/var/lib/etcd",
"ETCD_ADVERTISE_CLIENT_URLS=https://172.16.60.1:2379",
"ETCD_INITIAL_ADVERTISE_PEER_URLS=https://172.16.60.1:2380",
"ETCD_INITIAL_CLUSTER_STATE=existing",
"ETCD_METRICS=basic",
"ETCD_LISTEN_CLIENT_URLS=https://172.16.60.1:2379,https://127.0.0.1:2379",
"ETCD_ELECTION_TIMEOUT=5000",
"ETCD_HEARTBEAT_INTERVAL=250",
"ETCD_INITIAL_CLUSTER_TOKEN=k8s_etcd",
"ETCD_LISTEN_PEER_URLS=https://172.16.60.1:2380",
"ETCD_NAME=etcd1",
"ETCD_PROXY=off",
"ETCD_INITIAL_CLUSTER=etcd1=https://172.16.60.1:2380,etcd2=https://172.16.60.2:2380,etcd3=https://172.16.60.2:2380",
"ETCD_AUTO_COMPACTION_RETENTION=8",
"ETCD_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem",
"ETCD_CERT_FILE=/etc/ssl/etcd/ssl/member-node01.pem",
"ETCD_KEY_FILE=/etc/ssl/etcd/ssl/member-node01-key.pem",
"ETCD_PEER_TRUSTED_CA_FILE=/etc/ssl/etcd/ssl/ca.pem",
"ETCD_PEER_CERT_FILE=/etc/ssl/etcd/ssl/member-node01.pem",
"ETCD_PEER_KEY_FILE=/etc/ssl/etcd/ssl/member-node01-key.pem",
"ETCD_PEER_CLIENT_CERT_AUTH=true",
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
],
"Cmd": [
"/usr/local/bin/etcd"
],
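If only a few of these values are needed, they can be pulled out of the inspect output instead of being read by eye. The following is a small sketch; the helper names container_env and env_value are hypothetical, and it relies on the environment being stored under .Config.Env in the docker inspect output.

```shell
#!/bin/sh
# Print the environment of container $1, one VAR=value entry per line.
container_env() {
    docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$1"
}

# Read VAR=value lines on stdin and print the value of variable $1.
# Only the leading "VAR=" is stripped, so values that contain "="
# themselves (such as ETCD_INITIAL_CLUSTER) are printed intact.
env_value() {
    sed -n "s/^$1=//p"
}
```

For example, container_env etcd1 | env_value ETCD_INITIAL_CLUSTER would print the member list that the restore commands need for --initial-cluster.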
Copy the etcd snapshot to the other etcd nodes
scp snapshotdb_20180913 node02:/root/
scp snapshotdb_20180913 node03:/root/
Rebuild the new cluster using the original information
# etcd1
docker stop etcd1
rm -rf /var/lib/etcd
ETCDCTL_API=3 etcdctl snapshot restore snapshotdb_20180913 \
--cacert /etc/ssl/etcd/ssl/ca.pem \
--cert /etc/ssl/etcd/ssl/member-node01.pem \
--key /etc/ssl/etcd/ssl/member-node01-key.pem \
--name etcd1 \
--initial-cluster etcd1=https://node01:2380,etcd2=https://node02:2380,etcd3=https://node03:2380 \
--initial-cluster-token k8s_etcd \
--initial-advertise-peer-urls https://node01:2380 \
--data-dir /var/lib/etcd
# etcd2
docker stop etcd2
rm -rf /var/lib/etcd
ETCDCTL_API=3 etcdctl snapshot restore snapshotdb_20180913 \
--cacert /etc/ssl/etcd/ssl/ca.pem \
--cert /etc/ssl/etcd/ssl/member-node02.pem \
--key /etc/ssl/etcd/ssl/member-node02-key.pem \
--name etcd2 \
--initial-cluster etcd1=https://node01:2380,etcd2=https://node02:2380,etcd3=https://node03:2380 \
--initial-cluster-token k8s_etcd \
--initial-advertise-peer-urls https://node02:2380 \
--data-dir /var/lib/etcd
# etcd3
docker stop etcd3
rm -rf /var/lib/etcd
ETCDCTL_API=3 etcdctl snapshot restore snapshotdb_20180913 \
--cacert /etc/ssl/etcd/ssl/ca.pem \
--cert /etc/ssl/etcd/ssl/member-node03.pem \
--key /etc/ssl/etcd/ssl/member-node03-key.pem \
--name etcd3 \
--initial-cluster etcd1=https://node01:2380,etcd2=https://node02:2380,etcd3=https://node03:2380 \
--initial-cluster-token k8s_etcd \
--initial-advertise-peer-urls https://node03:2380 \
--data-dir /var/lib/etcd
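The three restore commands differ only in the member name and the advertised peer URL, so they can be generated from the member list. This is a sketch under assumptions: the helper names peer_url and restore_cmd are made up for illustration, and the TLS flags are left out on the assumption that snapshot restore only works on local files and does not contact a server; add them back if your setup turns out to need them. The function prints the command rather than running it.

```shell
#!/bin/sh
# Cluster members as "name=peer-url" pairs, taken from the steps above.
INITIAL_CLUSTER="etcd1=https://node01:2380,etcd2=https://node02:2380,etcd3=https://node03:2380"

# Print the peer URL of member $1 as listed in INITIAL_CLUSTER.
peer_url() {
    printf '%s\n' "$INITIAL_CLUSTER" | tr ',' '\n' | sed -n "s/^$1=//p"
}

# Print (not run) the restore command for member $1 from snapshot file $2.
restore_cmd() {
    printf '%s\n' "ETCDCTL_API=3 etcdctl snapshot restore $2 \
--name $1 \
--initial-cluster ${INITIAL_CLUSTER} \
--initial-cluster-token k8s_etcd \
--initial-advertise-peer-urls $(peer_url "$1") \
--data-dir /var/lib/etcd"
}
```

On node02, for instance, restore_cmd etcd2 snapshotdb_20180913 prints the command for that member; the same token and member list must be used on every node so that the restored members agree on the cluster identity.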
Start the containers and check the cluster status
cd /etc/ssl/etcd/ssl
etcdctl \
--endpoints=https://node01:2379 \
--ca-file=./ca.pem \
--cert-file=./member-node01.pem \
--key-file=./member-node01-key.pem \
member list
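The command above uses the v2-style flags (--ca-file, --cert-file, --key-file) and does not show the container start itself. As a hedged sketch, the start plus a v3 health check could look like the following; the helper name start_and_check_cmds is hypothetical, and the v3 flags --cacert, --cert and --key are the same ones used in the restore commands. The function prints the commands instead of running them.

```shell
#!/bin/sh
# Print (not run) the commands that start member container $1 on host $2
# and then query its health through the v3 API with certificate name $3.
start_and_check_cmds() {
    ssl=/etc/ssl/etcd/ssl
    printf '%s\n' "docker start $1"
    printf '%s\n' "ETCDCTL_API=3 etcdctl --endpoints=https://$2:2379 \
--cacert ${ssl}/ca.pem \
--cert ${ssl}/$3.pem \
--key ${ssl}/$3-key.pem \
endpoint health"
}
```

For the first node this would be start_and_check_cmds etcd1 node01 member-node01; the health check is only meaningful once a majority of the members have been started.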