CQLSH connection refused on EC2 Cassandra cluster nodes

Question

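I am trying to set up a Cassandra cluster on four EC2 t2.2xlarge nodes, with one node designated as the seed. The cluster appears to have started on every node. However, when I try to run /opt/cassandra/bin/cqlsh, I get the following error: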

Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})

When I run netstat for port 9042 on the seed node, I get the following output:

Proto  Recv-Q  Send-Q  Local Address                Foreign Address             State
tcp         0       0  ip-172-xx-xx-111.eu-wes:9042  *:*                         LISTEN

I suspect this host address is the source of the problem, but I don't know how it gets set or how to change it. Should it be 127.0.0.1 or localhost?

I have a security group configured with the following rule for port 9042:

Type               Protocol    Port Range    Source
-------------------------------------------------------------------------
Custom TCP Rule    TCP         9042          sg-<group-id> (<group-name>)

Maybe the source here is wrong? Should it be localhost or something else?

These are the values I changed in cassandra.yaml on each node:

listen_interface: eth0

broadcast_address: <local-PRIVATE-ip>

rpc_address: <local-PRIVATE-ip>

seed_provider:
# Addresses of hosts that are deemed contact points.
# Cassandra nodes use this list of hosts to find each other and learn
# the topology of the ring.  You must change this if you are running
# multiple nodes!
- class_name: org.apache.cassandra.locator.SimpleSeedProvider
  parameters:
      # seeds is actually a comma-delimited list of addresses.
      # Ex: "<ip1>,<ip2>,<ip3>"
      - seeds: "<seed-node-PRIVATE-ip>"

When I start each node, the final messages in the log are:

INFO  11:32:44 Node /172.xx.xx.222 state jump to NORMAL
INFO  11:32:44 Waiting for gossip to settle before accepting client requests...
INFO  11:32:44 Compacted 4 sstables to [/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-ka-13,].  11,190 bytes to 5,773 (~51% of original) in 24ms = 0.229398MB/s.  4 total partitions merged to 1.  Partition merge counts were {4:1, }
INFO  11:32:52 No gossip backlog; proceeding

The last few lines of the seed node's log are:

INFO  11:58:35 Enqueuing flush of local: 578 (0%) on-heap, 0 (0%) off-heap
INFO  11:58:35 Writing Memtable-local@1006553205(0.081KiB serialized bytes, 4 ops, 0%/0% of on/off-heap limit)
INFO  11:58:35 Completed flushing /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-tmp-ka-14-Data.db (0.000KiB) for commitlog position ReplayPosition(segmentId=1550836714360, position=94125)
INFO  11:58:35 Handshaking version with /172.xx.xx.222
INFO  11:58:35 Node /172.xx.xx.333 has restarted, now UP
INFO  11:58:35 Handshaking version with /172.xx.xx.333
INFO  11:58:35 Node /172.xx.xx.333 state jump to NORMAL
INFO  11:58:35 Enqueuing flush of local: 51462 (0%) on-heap, 0 (0%) off-heap
INFO  11:58:35 Writing Memtable-local@961534831(8.349KiB serialized bytes, 259 ops, 0%/0% of on/off-heap limit)
INFO  11:58:35 Completed flushing /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/system-local-tmp-ka-15-Data.db (0.000KiB) for commitlog position ReplayPosition(segmentId=1550836714360, position=106779)
INFO  11:58:35 InetAddress /172.xx.xx.333 is now UP
INFO  11:58:35 Node /172.xx.xx.111 state jump to NORMAL
INFO  11:58:35 Updating topology for /172.xx.xx.333
INFO  11:58:35 Updating topology for /172.xx.xx.333
INFO  11:58:35 Node /172.xx.xx.444 has restarted, now UP
INFO  11:58:35 Waiting for gossip to settle before accepting client requests...
INFO  11:58:35 Node /172.xx.xx.444 state jump to NORMAL
INFO  11:58:35 Handshaking version with /172.xx.xx.444
INFO  11:58:35 InetAddress /172.xx.xx.444 is now UP
INFO  11:58:35 Updating topology for /172.xx.xx.444
INFO  11:58:35 Updating topology for /172.xx.xx.444
INFO  11:58:35 Node /172.xx.xx.222 has restarted, now UP
INFO  11:58:35 Node /172.xx.xx.222 state jump to NORMAL
INFO  11:58:35 InetAddress /172.xx.xx.222 is now UP
INFO  11:58:35 Updating topology for /172.xx.xx.222
INFO  11:58:35 Updating topology for /172.xx.xx.222
INFO  11:58:38 Updating topology for all endpoints that have changed
INFO  11:58:43 No gossip backlog; proceeding

So the IP of each non-seed node (172.xx.xx.222/333/444) appears to be reported as UP. The seed node (172.xx.xx.111) is only reported as "state jump to NORMAL".

Answer 1

Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})

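It looks like you are trying to connect to 127.0.0.1 with cqlsh, which will not work in a multi-node cluster. When no host is given, cqlsh defaults to 127.0.0.1, but the node only accepts client connections on the address set as rpc_address, which here is the private IP. Specify the exact (broadcast) IP along with your credentials, and it should let you in.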

For example:

$ grep _address conf/cassandra.yaml | grep -v "#"

listen_address: 192.168.1.4
broadcast_address: 10.1.3.6
rpc_address: 192.168.1.4
broadcast_rpc_address: 10.1.3.6

$ bin/cqlsh 10.1.3.6 -u flynn -p reindeerFlotilla

Connected to AaronTest at 10.1.3.6:9042.
[cqlsh 5.0.1 | Cassandra 3.11.2 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
flynn@cqlsh>
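
If it is unclear which address the node is accepting client connections on, a quick TCP probe can confirm it before trying cqlsh. The following is only a minimal sketch: the helper name port_open is hypothetical, and 9042 is assumed to be the native transport port, as in the error message above.

```python
import socket


def port_open(host, port=9042, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds, else False.

    Hypothetical helper for illustration; 9042 is assumed to be the
    native transport port the node is configured with.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError or
        # a timeout) when nothing is reachable at that address and port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

On the seed node from the question, port_open("127.0.0.1") would be expected to return False, because netstat shows the listener bound only to the private address. Passing the node's own rpc_address (or broadcast_rpc_address) instead should return True if the node is ready for clients, and that is the address to hand to cqlsh.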
