提交拓扑后命令状态停止"Creating job WordCountTopology"

The command status stop "Creating job WordCountTopology" after submit a Topology

我尝试使用 Apache Mesos、Apache Aurora、ZooKeeper 和 HDFS 构建 Heron 集群。但是,当我在完成后提交 WordCountTopology 时,命令输出如下: Stopping the "Creating job WordCountTopology".

yitian@ubuntu:~/.heron/conf/aurora$ heron submit aurora/yitian/devel --config-path ~/.heron/conf ~/.heron/examples/heron-api-examples.jar com.twitter.heron.examples.api.WordCountTopology WordCountTopology
[2018-02-13 06:58:30 +0000] [INFO]: Using cluster definition in /home/yitian/.heron/conf/aurora
[2018-02-13 06:58:30 +0000] [INFO]: Launching topology: 'WordCountTopology'
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/uploader/heron-dlog-uploader.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/yitian/.heron/lib/statemgr/heron-zookeeper-statemgr.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.JDK14LoggerFactory]
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Starting Curator client connecting to: heron01:2181  
[2018-02-13 06:58:31 -0800] [INFO] org.apache.curator.framework.imps.CuratorFrameworkImpl: Starting  
[2018-02-13 06:58:31 -0800] [INFO] org.apache.curator.framework.state.ConnectionStateManager: State change: CONNECTED  
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Directory tree initialized.  
[2018-02-13 06:58:31 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Checking existence of path: /home/yitian/heron/state/topologies/WordCountTopology  
[2018-02-13 06:58:34 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: The destination directory does not exist. Creating it now at URI '/home/yitian/heron/topologies/aurora'  
[2018-02-13 06:58:37 -0800] [INFO] com.twitter.heron.uploader.hdfs.HdfsUploader: Uploading topology package at '/tmp/tmpvYzRv7/topology.tar.gz' to target HDFS at '/home/yitian/heron/topologies/aurora/WordCountTopology-yitian-tag-0--8268125700662472072.tar.gz'  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/topologies/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/packingplans/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/executionstate/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.scheduler.aurora.AuroraLauncher: Launching topology in aurora  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.scheduler.utils.SchedulerUtils: Updating scheduled-resource in packing plan: WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Deleted node for path: /home/yitian/heron/state/packingplans/WordCountTopology  
[2018-02-13 06:58:41 -0800] [INFO] com.twitter.heron.statemgr.zookeeper.curator.CuratorStateManager: Created node for path: /home/yitian/heron/state/packingplans/WordCountTopology  
INFO] Creating job WordCountTopology

Heron Tracker 显示:

status  "success"
executiontime   0.00007081031799316406
message ""
version "0.17.1"
result  {}

Heron UI 什么也没显示:

Aurora 调度程序 运行 作为:

此外,它在集群中有两个主机。

  1. master名为heron01,运行Mesos Master,zookeeper和Aurora Scheduler。
  2. slave名为heron02,运行Mesos slave,Aurora Observer和Executor。

我可以使用网站打开 Observer(heron02:1338) 和 Executor(heron02:5051)。我不知道我在哪里犯了错误。集群配置太复杂了,我不能在这里完全展示。你可以看到我的网站关于集群配置。很抱歉,我的网站是中文的,但我相信您可以理解网站中的配置文件内容。该博客是 here 非常感谢您的帮助。

这个问题是集群资源不足造成的。当Aurora Scheduler将实例调度到Heron集群中的worker节点时,如果某个worker节点没有足够的资源分配一个实例,就会导致该实例挂起,等待集群中资源充足的工作节点出现。 于是通过增加Heron集群中worker节点的RAM资源解决了这个问题。