提交拓扑后命令状态停止"Creating job WordCountTopology"
The command status stop "Creating job WordCountTopology" after submit a Topology
我尝试使用 Apache Mesos、Apache Aurora、ZooKeeper 和 HDFS 构建 Heron 集群。但是,当我在完成后提交 WordCountTopology 时,命令输出如下: Stopping the "Creating job WordCountTopology".
yitian@ubuntu:~/.heron/conf/aurora$ heron submit aurora/yitian/devel --config-path ~/.heron/conf ~/.heron/examples/heron-api-examples.jar com.twitter.heron.examples.api.WordCountTopology WordCountTopology
[2018-02-13 06:58:30 +0000] [INFO]: Using cluster definition in /home/yitian/.heron/conf/aurora
[2018-02-13 06:58:30 +0000] [INFO]: Launching topology: 'WordCountTopology'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/uploader/heron-dlog-uploader.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/statemgr/heron-zookeeper-statemgr.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Starting Curator client connecting to: heron01:2181
[2018-02-13 06:58:31 -0800] [INFO] org.apache.curator.framework.imps.CuratorFrameworkImpl: Starting
[2018-02-13 06:58:31 -0800] [INFO] org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Directory tree initialized.
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Checking existence of path: /home/yitian/heron/state/topologies/WordCountTopology
[2018-02-13 06:58:34 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: The destination directory does not exist. Creating it now at URI '/home/yitian/heron/topologies/aurora'
[2018-02-13 06:58:37 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: Uploading topology package at '/tmp/tmpvYzRv7/topology.tar.gz' to target HDFS at '/home/yitian/heron/topologies/aurora/WordCountTopology-yitian-tag-0--8268125700662472072.tar.gz'
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/topologies/WordCountTopology
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/packingplans/WordCountTopology
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/executionstate/WordCountTopology
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.scheduler.aurora.AuroraLauncher: Launching topology in aurora
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.scheduler.utils.SchedulerUtils: Updating scheduled-resource in packing plan: WordCountTopology
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Deleted node for path: /home/yitian/heron/state/packingplans/WordCountTopology
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/packingplans/WordCountTopology
INFO] Creating job WordCountTopology
Heron Tracker 显示:
status "success"
executiontime 0.00007081031799316406
message ""
version "0.17.1"
result {}
Heron UI 什么也没显示:
Aurora 调度程序 运行 作为:
此外,它在集群中有两个主机。
- master名为heron01,运行Mesos Master,zookeeper和Aurora Scheduler。
- slave名为heron02,运行Mesos slave,Aurora Observer和Executor。
我可以使用网站打开 Observer(heron02:1338
) 和 Executor(heron02:5051
)。我不知道我在哪里犯了错误。集群配置太复杂了,我不能在这里完全展示。你可以看到我的网站关于集群配置。很抱歉,我的网站是中文的,但我相信您可以理解网站中的配置文件内容。该博客是 here
非常感谢您的帮助。
这个问题是集群资源不足造成的。当Aurora Scheduler将实例调度到Heron集群中的worker节点时,如果某个worker节点没有足够的资源分配一个实例,就会导致该实例挂起,等待集群中资源充足的工作节点出现。
于是通过增加Heron集群中worker节点的RAM资源解决了这个问题。
我尝试使用 Apache Mesos、Apache Aurora、ZooKeeper 和 HDFS 构建 Heron 集群。但是,当我在完成后提交 WordCountTopology 时,命令输出如下: Stopping the "Creating job WordCountTopology".
yitian@ubuntu:~/.heron/conf/aurora$ heron submit aurora/yitian/devel --config-path ~/.heron/conf ~/.heron/examples/heron-api-examples.jar com.twitter.heron.examples.api.WordCountTopology WordCountTopology
[2018-02-13 06:58:30 +0000] [INFO]: Using cluster definition in /home/yitian/.heron/conf/aurora
[2018-02-13 06:58:30 +0000] [INFO]: Launching topology: 'WordCountTopology'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/uploader/heron-dlog-uploader.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/statemgr/heron-zookeeper-statemgr.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Starting Curator client connecting to: heron01:2181
[2018-02-13 06:58:31 -0800] [INFO] org.apache.curator.framework.imps.CuratorFrameworkImpl: Starting
[2018-02-13 06:58:31 -0800] [INFO] org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Directory tree initialized.
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Checking existence of path: /home/yitian/heron/state/topologies/WordCountTopology
[2018-02-13 06:58:34 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: The destination directory does not exist. Creating it now at URI '/home/yitian/heron/topologies/aurora'
[2018-02-13 06:58:37 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: Uploading topology package at '/tmp/tmpvYzRv7/topology.tar.gz' to target HDFS at '/home/yitian/heron/topologies/aurora/WordCountTopology-yitian-tag-0--8268125700662472072.tar.gz'
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/topologies/WordCountTopology
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/packingplans/WordCountTopology
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/executionstate/WordCountTopology
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.scheduler.aurora.AuroraLauncher: Launching topology in aurora
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.scheduler.utils.SchedulerUtils: Updating scheduled-resource in packing plan: WordCountTopology
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Deleted node for path: /home/yitian/heron/state/packingplans/WordCountTopology
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/packingplans/WordCountTopology
INFO] Creating job WordCountTopology
Heron Tracker 显示:
status "success"
executiontime 0.00007081031799316406
message ""
version "0.17.1"
result {}
Heron UI 什么也没显示:
Aurora 调度程序 运行 作为:
此外,它在集群中有两个主机。
- master名为heron01,运行Mesos Master,zookeeper和Aurora Scheduler。
- slave名为heron02,运行Mesos slave,Aurora Observer和Executor。
我可以使用网站打开 Observer(heron02:1338
) 和 Executor(heron02:5051
)。我不知道我在哪里犯了错误。集群配置太复杂了,我不能在这里完全展示。你可以看到我的网站关于集群配置。很抱歉,我的网站是中文的,但我相信您可以理解网站中的配置文件内容。该博客是 here
非常感谢您的帮助。
这个问题是集群资源不足造成的。当Aurora Scheduler将实例调度到Heron集群中的worker节点时,如果某个worker节点没有足够的资源分配一个实例,就会导致该实例挂起,等待集群中资源充足的工作节点出现。 于是通过增加Heron集群中worker节点的RAM资源解决了这个问题。