nodetool status: "error: No nodes present in the cluster. Has this node finished starting up?"
I am trying to set up a 2-node cassandra-2.1 cluster with the following node configuration:
Cluster Name: 'Cluster1'
num_tokens: 256
listen_address: 10.20.0.52/10.20.0.53
rpc_address: 10.20.0.52/10.20.0.53
class_name: org.apache.cassandra.locator.SimpleSeedProvider
parameters:
# seeds is actually a comma-delimited list of addresses.
# Ex: "<ip1>,<ip2>,<ip3>"
- seeds: "10.20.0.52"
I start the seed node (52) first, then check nodetool status on 52 only, and it returns data. But once I start (53), nodetool status throws the following exception after a few seconds:
-- StackTrace --
java.lang.RuntimeException: No nodes present in the cluster. Has this node finished starting up?
at org.apache.cassandra.dht.Murmur3Partitioner.describeOwnership(Murmur3Partitioner.java:131)
at org.apache.cassandra.service.StorageService.getOwnership(StorageService.java:3912)
at sun.reflect.GeneratedMethodAccessor26.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at sun.reflect.misc.Trampoline.invoke(MethodUtil.java:71)
at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at sun.reflect.misc.MethodUtil.invoke(MethodUtil.java:275)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:112)
at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:46)
at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:237)
at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:83)
at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:206)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:647)
at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:678)
at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1443)
at javax.management.remote.rmi.RMIConnectionImpl.access0(RMIConnectionImpl.java:76)
at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1307)
at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1399)
at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:637)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
at sun.rmi.transport.Transport.run(Transport.java:200)
at sun.rmi.transport.Transport.run(Transport.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run0(TCPTransport.java:683)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda/1165999373.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
On the non-seed node (53), however, it returns normal output containing details only about itself (53).
nodetool gossipinfo on the seed node (52) returns information about both nodes:
/10.20.0.52
generation:1439824481
heartbeat:2433
SCHEMA:500091e4-e8ab-303d-9111-8cca7edff2d0
HOST_ID:2d78ed48-13e8-4fc5-ac55-8b2a6d00c8c5
NET_VERSION:8
RELEASE_VERSION:2.1.8-SNAPSHOT
STATUS:NORMAL,-1091407767707699731
RPC_ADDRESS:10.20.0.52
SEVERITY:0.5025125741958618
DC:DC1
LOAD:2524926.0
RACK:RAC1
INTERNAL_IP:10.20.0.52
/10.20.0.53
generation:1439824502
heartbeat:2376
SCHEMA:500091e4-e8ab-303d-9111-8cca7edff2d0
NET_VERSION:8
HOST_ID:2d78ed48-13e8-4fc5-ac55-8b2a6d00c8c5
RELEASE_VERSION:2.1.8-SNAPSHOT
STATUS:NORMAL,-1091407767707699731
RPC_ADDRESS:10.20.0.53
SEVERITY:0.0
DC:DC1
LOAD:2603302.0
RACK:RAC1
INTERNAL_IP:10.20.0.53
On the non-seed node, however, it shows information only about itself and does not include the seed node (52).
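In the gossipinfo output pasted above, both endpoints report the same HOST_ID and the same token in STATUS, which may be worth checking for explicitly. The following is a minimal sketch, assuming the output has been captured as text in the layout shown above; `parse_gossipinfo` and `find_shared_values` are hypothetical helper names, not part of nodetool.

```python
from collections import defaultdict

# Trimmed copy of the gossipinfo output quoted in this post.
SAMPLE = """\
/10.20.0.52
generation:1439824481
HOST_ID:2d78ed48-13e8-4fc5-ac55-8b2a6d00c8c5
STATUS:NORMAL,-1091407767707699731
/10.20.0.53
generation:1439824502
HOST_ID:2d78ed48-13e8-4fc5-ac55-8b2a6d00c8c5
STATUS:NORMAL,-1091407767707699731
"""


def parse_gossipinfo(text):
    """Return {endpoint: {key: value}} from gossipinfo-style text."""
    nodes = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("/"):
            # A line such as "/10.20.0.52" starts a new endpoint section.
            current = line[1:]
            nodes[current] = {}
        elif current is not None and ":" in line:
            # Split on the first colon only; values may contain more.
            key, value = line.split(":", 1)
            nodes[current][key] = value
    return nodes


def find_shared_values(nodes, keys=("HOST_ID", "STATUS")):
    """Return {key: {value: [endpoints]}} for values seen on more than one endpoint."""
    shared = {}
    for key in keys:
        owners = defaultdict(list)
        for endpoint, fields in nodes.items():
            if key in fields:
                owners[fields[key]].append(endpoint)
        duplicates = {v: eps for v, eps in owners.items() if len(eps) > 1}
        if duplicates:
            shared[key] = duplicates
    return shared


if __name__ == "__main__":
    for key, values in find_shared_values(parse_gossipinfo(SAMPLE)).items():
        for value, endpoints in values.items():
            print(f"{key} {value} is shared by {', '.join(endpoints)}")
```

Run against the real output of both nodes, an empty result would suggest that host IDs and tokens are distinct; a non-empty one points at the kind of collision discussed in the update below.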
Another difference in state between the 2 nodes is the output of nodetool netstats, which for the seed node (52) shows:
ubuntu@52:~$ nodetool netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 0 0
Responses n/a 0 1135
whereas for the non-seed node (53), the number of completed responses is roughly twice that of the seed node:
ubuntu@53:~$ nodetool netstats
Mode: NORMAL
Not sending any streams.
Read Repair Statistics:
Attempted: 0
Mismatch (Blocking): 0
Mismatch (Background): 0
Pool Name Active Pending Completed
Commands n/a 0 0
Responses n/a 0 2388
Source code
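Given the stack trace, I inserted some debug flags and printed the contents that lead to the error at L206 of Murmur3Partitioner.java whenever the describeOwnership method is called: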
- the method is called when the seed node starts
- the method is called when the non-seed node is bootstrapped
Both times the token list (sortedTokens) is exactly the same, but the iterator is empty, which triggers the error in the title.
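To make the failure mode concrete, here is a simplified Python sketch of what an ownership calculation over a sorted token list can look like. It is an illustration only, not Cassandra's Java code: the ring bounds assume Murmur3-style signed 64-bit tokens, and the function name and structure are my own. The point it shows is that the error message is raised only when the token list handed to the method is empty.

```python
# Assumed ring bounds for signed 64-bit tokens (illustrative).
RING_MIN = -2**63
RING_MAX = 2**63 - 1
RING_SIZE = RING_MAX - RING_MIN


def describe_ownership(sorted_tokens):
    """Map each token to the fraction of the ring ending at that token."""
    if not sorted_tokens:
        # The empty-list case is the one that produces the message in the title.
        raise RuntimeError(
            "No nodes present in the cluster. Has this node finished starting up?"
        )
    if len(sorted_tokens) == 1:
        # A single token owns the whole ring.
        return {sorted_tokens[0]: 1.0}

    ownership = {}
    previous = sorted_tokens[-1]  # the first range wraps around from the last token
    for token in sorted_tokens:
        # Python's % always returns a non-negative result for a positive modulus,
        # so the wrap-around range is handled without special cases.
        ownership[token] = ((token - previous) % RING_SIZE) / RING_SIZE
        previous = token
    return ownership


if __name__ == "__main__":
    print(describe_ownership([-2**62, 2**62]))
    try:
        describe_ownership([])
    except RuntimeError as exc:
        print("error:", exc)
```

Under this reading, an exception from nodetool status would mean that the node answering the JMX call had no tokens in its view of the ring at that moment, whatever the other node reports.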
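Note: the relevant ports (7000, 7001) are open on both nodes (52, 53).
Update #1: I found out (thanks to the #cassandra IRC channel) that if two nodes have the same tokens, a conflict is created and the node will not be able to bootstrap.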
To fix this, I tried the following:
cqlsh> DROP KEYSPACE ycsb ;
That did not solve the problem: nodetool ring still showed the same tokens for the non-seed node. I also flushed the changes after closing cqlsh. Then:
sudo rm -rf /var/lib/cassandra/data/*
sudo rm -rf /var/lib/cassandra/commitlog/*
sudo rm -rf /var/lib/cassandra/saved_caches/*
This still did not reduce or change the tokens shown by nodetool ring.
Any guidance is appreciated.
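The culprit appears to have been the ports and firewall rules, which did not allow the nodes to establish a symmetric, bidirectional connection to exchange the tokens residing on each node. The troubleshooting steps taken were: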
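1) netstat -l on both nodes, to see which ports are open/listening;
2) nmap from one node to the other, to scan for open ports;
3) nodetool ring, to compare the tokens on the two nodes;
4) the TRACE logging level set in logback.xml, with output sent to a separate log file or to stderr.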
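For steps 1) and 2), a small script run from each node against the other can confirm that the connection works in both directions. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the host list and ports (7000, 7001) are taken from the question and should be adjusted to your setup.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_cluster_ports(hosts, ports=(7000, 7001), timeout=2.0):
    """Return {(host, port): bool} for every host/port combination."""
    return {
        (host, port): is_port_open(host, port, timeout)
        for host in hosts
        for port in ports
    }


if __name__ == "__main__":
    # Addresses from the question; run this on 52 and again on 53.
    for (host, port), ok in check_cluster_ports(["10.20.0.52", "10.20.0.53"]).items():
        print(f"{host}:{port} {'open' if ok else 'unreachable'}")
```

A port that is reachable from one node but not from the other is the asymmetric case described above. Port 7001 is normally only in use when internode encryption is enabled, so it may legitimately show as unreachable.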
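I would also suggest discussing your problem on the #cassandra IRC channel. The people there are knowledgeable and can help almost in real time.
Hope this helps!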