How to fix "Could not find leader nimbus from seed hosts [192.168.23.165]"
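Basic setup: three virtual machines, 192.168.23.165, 192.168.23.166, and 192.168.23.172. ZooKeeper runs on 165 in standalone mode, and Storm runs on all three machines. The firewall is turned off on all three machines. The ZooKeeper and Storm versions are 3.4.14 and 1.2.3, respectively.
My steps:
First, I started ZooKeeper on 165.
Second, I started Storm nimbus on 165 and Storm supervisor on 166 and 172.
Third, I submitted the Storm topology on 165.
Problem 1: The topology is submitted successfully, but jps -l shows that no worker processes are created on 166 or 172. I checked supervisor.log on 166; the log on 172 shows the same thing.
Problem 2: When I run jps -l again on the machines running the supervisor, the supervisor process has stopped for no apparent reason.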
supervisor.log
2019-09-29 16:55:41.076 o.a.s.u.NimbusClient Async Localizer [WARN] Ignoring exception while trying to get leader nimbus info from 192.168.23.165. will retry with a different seed host.
java.lang.RuntimeException: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:112) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.ThriftClient.<init>(ThriftClient.java:73) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.<init>(NimbusClient.java:136) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:103) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:66) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:540) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) [storm-core-1.2.3.jar:1.2.3]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
Caused by: java.lang.RuntimeException: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
at org.apache.storm.security.auth.TBackoffConnect.retryNext(TBackoffConnect.java:64) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:56) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:104) ~[storm-core-1.2.3.jar:1.2.3]
... 14 more
Caused by: org.apache.storm.thrift.transport.TTransportException: java.net.ConnectException: 拒绝连接 (Connection refused)
at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:226) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:82) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:104) ~[storm-core-1.2.3.jar:1.2.3]
... 14 more
Caused by: java.net.ConnectException: 拒绝连接 (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_111]
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_111]
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_111]
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_111]
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_111]
at java.net.Socket.connect(Socket.java:589) ~[?:1.8.0_111]
at org.apache.storm.thrift.transport.TSocket.open(TSocket.java:221) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.thrift.transport.TFramedTransport.open(TFramedTransport.java:82) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:105) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:53) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:104) ~[storm-core-1.2.3.jar:1.2.3]
... 14 more
2019-09-29 16:55:41.083 o.a.s.l.AsyncLocalizer Async Localizer [WARN] Failed to download basic resources for topology-id RandomStringTopologyLocal-1-1569747324
2019-09-29 16:55:41.083 o.a.s.d.s.AdvancedFSOps Async Localizer [INFO] Deleting path /opt/storm/data/supervisor/tmp/37b4a240-736b-40e8-a3a7-e3933fc2105c
2019-09-29 16:55:41.085 o.a.s.d.s.AdvancedFSOps Async Localizer [INFO] Deleting path /opt/storm/data/supervisor/stormdist/RandomStringTopologyLocal-1-1569747324
2019-09-29 16:55:41.086 o.a.s.l.AsyncLocalizer Async Localizer [WARN] Caught Exception While Downloading (rethrowing)...
org.apache.storm.utils.NimbusLeaderNotFoundException: Could not find leader nimbus from seed hosts [192.168.23.165]. Did you specify a valid list of nimbus hosts for config nimbus.seeds?
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:120) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.getConfiguredClientAs(NimbusClient.java:66) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.NimbusClient.getConfiguredClient(NimbusClient.java:58) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.blobstore.NimbusBlobStore.prepare(NimbusBlobStore.java:268) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.utils.Utils.getClientBlobStoreForSupervisor(Utils.java:540) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:121) ~[storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) [storm-core-1.2.3.jar:1.2.3]
at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) [storm-core-1.2.3.jar:1.2.3]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
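Notably, when I run the supervisor on 165, so that the supervisor, nimbus, and ZooKeeper are all on the same machine, and submit the topology again, the worker processes are created and everything works.
The ZooKeeper configuration is as follows: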
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/zookeeper/data
logDir=/opt/zookeeper/log
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
The Storm configuration is as follows:
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
- "192.168.23.165"
# - "server1"
# - "server2"
#
nimbus.seeds: ["192.168.23.165"]
#
storm.local.dir: "/opt/storm/data"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
This may be an obvious question, but have you checked whether your 166 and 172 machines can connect to 192.168.23.165 on port 6627?
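One way to run that check from each supervisor host is a short script like the one below. This is a minimal sketch and not part of Storm: can_connect is a hypothetical helper name, and 6627 is assumed to be the Nimbus Thrift port (the Storm default for nimbus.thrift.port unless storm.yaml overrides it).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False
```

Calling can_connect("192.168.23.165", 6627) on 166 and on 172 should return True if Nimbus is reachable. If it returns False there but True when run on 165 itself, that would suggest Nimbus is listening only on a local interface or that something between the hosts is still blocking the port.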