单个 docker-swarm 节点上的 Cassandra 副本
Cassandra replicas on single docker-swarm node
我有一个 docker-swarm 管理器节点 (18.09.6) 运行ning,我正在玩一个 cassandra 集群。我正在使用以下定义,它的工作原理是 seed/master 和 slave 旋转起来并且 communicate/replicate 它们的 data/schema 变化很好:
services:
cassandra-masters:
image: cassandra:2.2
environment:
- MAX_HEAP_SIZE=128m
- HEAP_NEWSIZE=32m
- CASSANDRA_BROADCAST_ADDRESS=cassandra-masters
deploy:
mode: replicated
replicas: 1
cassandra-slaves:
image: cassandra:2.2
environment:
- MAX_HEAP_SIZE=128m
- HEAP_NEWSIZE=32m
- CASSANDRA_SEEDS=cassandra-masters
- CASSANDRA_BROADCAST_ADDRESS=cassandra-slaves
deploy:
mode: replicated
replicas: 1
depends_on:
- cassandra-masters
当我将副本数从 1 更改为 2 时,无论是在部署堆栈还是 post 部署规模时,都会创建 cassandra slave 的第二个任务,但经常失败并显示错误提示无法与种子节点八卦:
INFO 10:51:03 Loading persisted ring state
INFO 10:51:03 Starting Messaging Service on /10.10.0.200:7000 (eth0
INFO 10:51:03 Handshaking version with cassandra-masters/10.10.0.142
Exception (java.lang.RuntimeException) encountered during startup: Unable to gossip with any seeds
java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1360)
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:521)
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:756)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:676)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:562)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:310)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:548)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:657)
ERROR 10:51:34 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1360) ~[apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:521) ~[apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:756) ~[apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:676) ~[apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:562) ~[apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:310) [apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:548) [apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:657) [apache-cassandra-2.2.14.jar:2.2.14]
我想了解导致问题的原因以及是否有解决方法?我只是在调查在我们显然要在不同节点而不是一个节点上旋转 cassandra tasks/replicas 的生产环境中有什么障碍。
编辑: 我在双节点群上旋转了相同的堆栈,我看到了相同的行为,即当我扩展到第二个 "slave" 任务,它失败并出现相同的错误,因此这不是尝试在同一节点上 运行 两个任务所特有的问题。
我还没有弄清楚八卦失败的原因,但最终我们同意了一个生产部署策略,我们不需要 auto-scaling,而是应该根据系统的行为和预期流量。这个答案还指出了 auto-scaling 可以添加到已经拉伸的系统的额外压力:AWS and auto scaling cassandra
我有一个 docker-swarm 管理器节点 (18.09.6) 运行ning,我正在玩一个 cassandra 集群。我正在使用以下定义,它的工作原理是 seed/master 和 slave 旋转起来并且 communicate/replicate 它们的 data/schema 变化很好:
services:
cassandra-masters:
image: cassandra:2.2
environment:
- MAX_HEAP_SIZE=128m
- HEAP_NEWSIZE=32m
- CASSANDRA_BROADCAST_ADDRESS=cassandra-masters
deploy:
mode: replicated
replicas: 1
cassandra-slaves:
image: cassandra:2.2
environment:
- MAX_HEAP_SIZE=128m
- HEAP_NEWSIZE=32m
- CASSANDRA_SEEDS=cassandra-masters
- CASSANDRA_BROADCAST_ADDRESS=cassandra-slaves
deploy:
mode: replicated
replicas: 1
depends_on:
- cassandra-masters
当我将副本数从 1 更改为 2 时,无论是在部署堆栈还是 post 部署规模时,都会创建 cassandra slave 的第二个任务,但经常失败并显示错误提示无法与种子节点八卦:
INFO 10:51:03 Loading persisted ring state
INFO 10:51:03 Starting Messaging Service on /10.10.0.200:7000 (eth0
INFO 10:51:03 Handshaking version with cassandra-masters/10.10.0.142
Exception (java.lang.RuntimeException) encountered during startup: Unable to gossip with any seeds
java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1360)
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:521)
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:756)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:676)
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:562)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:310)
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:548)
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:657)
ERROR 10:51:34 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1360) ~[apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:521) ~[apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:756) ~[apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:676) ~[apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.StorageService.initServer(StorageService.java:562) ~[apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:310) [apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:548) [apache-cassandra-2.2.14.jar:2.2.14]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:657) [apache-cassandra-2.2.14.jar:2.2.14]
我想了解导致问题的原因以及是否有解决方法?我只是在调查在我们显然要在不同节点而不是一个节点上旋转 cassandra tasks/replicas 的生产环境中有什么障碍。
编辑: 我在双节点群上旋转了相同的堆栈,我看到了相同的行为,即当我扩展到第二个 "slave" 任务,它失败并出现相同的错误,因此这不是尝试在同一节点上 运行 两个任务所特有的问题。
我还没有弄清楚八卦失败的原因,但最终我们同意了一个生产部署策略,我们不需要 auto-scaling,而是应该根据系统的行为和预期流量。这个答案还指出了 auto-scaling 可以添加到已经拉伸的系统的额外压力:AWS and auto scaling cassandra