Unable to connect a node to cluster
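I have three CockroachDB nodes: two on DigitalOcean (one in SF and one in NY) and a third on a server in TX. I followed the Manual Deployment documentation and started our local node, after which our remote nodes returned: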
*
* WARNING: The server appears to be unable to contact the other nodes in the cluster. Please try
*
* - starting the other nodes, if you haven't already
* - double-checking that the '--join' and '--host' flags are set up correctly
* - not using the '--background' flag.
*
* If problems persist, please see https://www.cockroachlabs.com/docs/v1.0/cluster-setup-troubleshooting.html.
*
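I ran nmap from our NY server against our TX node, and the port is open. I then ran cockroach start with --logtostderr and noticed that it was trying to resolve to a local IP, even though I had passed --join REMOTEIP:PORT.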
I171019 14:17:10.234575 12 cli/start.go:503 starting cockroach node
I171019 14:17:10.237272 12 storage/engine/rocksdb.go:411 opening rocksdb instance at "/root/cockroach-data/local"
W171019 14:17:10.251456 12 gossip/gossip.go:1241 [n?] no incoming or outgoing connections
I171019 14:17:10.251638 12 storage/engine/rocksdb.go:411 opening rocksdb instance at "/root/cockroach-data"
I171019 14:17:10.258098 12 server/config.go:528 [n?] 1 storage engine initialized
I171019 14:17:10.258271 12 server/config.go:530 [n?] RocksDB cache size: 500 MiB
I171019 14:17:10.258347 12 server/config.go:530 [n?] store 0: RocksDB, max size 0 B, max open file limit 10000
I171019 14:17:10.259025 12 server/server.go:837 [n?] no stores bootstrapped and --join flag specified, awaiting init command.
I171019 14:17:10.401973 21 gossip/client.go:129 [n?] started gossip client to 24.153.192.101:26257
I171019 14:17:10.454957 12 storage/stores.go:303 [n?] read 0 node addresses from persistent storage
I171019 14:17:10.455140 12 storage/stores.go:322 [n?] wrote 1 node addresses to persistent storage
I171019 14:17:10.455209 12 server/node.go:606 [n?] connecting to gossip network to verify cluster ID...
I171019 14:17:10.455268 12 server/node.go:631 [n?] node connected via gossip and verified as part of cluster "270f9533-45ef-4ff6-850d-da3160e9b5a6"
I171019 14:17:30.456253 70 vendor/google.golang.org/grpc/grpclog/grpclog.go:75 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: Error while dialing dial tcp 10.10.11.12:26257: i/o timeout"; Reconnecting to {10.10.11.12:26257 <nil>}
W171019 14:17:40.454698 79 server/server.go:878 The server appears to be unable to contact the other nodes in the cluster. Please try
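Did I set the local node's hostname incorrectly? The troubleshooting documentation was not very helpful. I even tried changing the TX host to its local IP, but that did not fix the problem.
Edit:
Our firewall was causing the communication problem. Once that was resolved, our TX node also needed the --advertise-host flag.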
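By default, a CockroachDB node advertises its own address as the value of --host.
Within a private network this works fine, because that address is usually resolvable and reachable by every node on the network.
However, when the nodes are in different networks, you may need to use --advertise-host to tell each node its public IP address.
You can find more details in the cluster troubleshooting documentation: https://www.cockroachlabs.com/docs/stable/cluster-setup-troubleshooting.html#networking-troubleshooting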
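As a rough illustration of the flag combination described above, here is a minimal sketch that only prints a v1.x-style start command rather than running it. The three addresses are hypothetical placeholders, not values from the post, and --insecure is assumed purely to keep the example short. As far as I know, later CockroachDB releases replaced --advertise-host with --advertise-addr, so check the documentation for your version.

```shell
#!/bin/sh
# Sketch only: prints the start command instead of executing it.
# All three addresses are hypothetical placeholders, not taken from the post.
LISTEN_HOST="10.0.0.5"           # private address the node listens on (assumed)
PUBLIC_HOST="203.0.113.10"       # public address other nodes can route to (assumed)
JOIN_ADDR="198.51.100.20:26257"  # an existing cluster node to join (assumed)

# Assemble a v1.x-style command line from listen, advertise, and join addresses.
# --insecure is an assumption for brevity; a secure cluster would pass certificates instead.
build_start_cmd() {
  printf 'cockroach start --insecure --host=%s --advertise-host=%s --join=%s\n' "$1" "$2" "$3"
}

build_start_cmd "$LISTEN_HOST" "$PUBLIC_HOST" "$JOIN_ADDR"
```

The point of the split is that --host stays on an address the machine actually owns, while --advertise-host carries the address that gossip hands to the other nodes, so peers stop dialing a private IP they cannot reach.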