Apache Mesos - ZooKeeper failing on load, cannot access Marathon/attach slaves.

I am setting up a Mesos cluster. Our setup is:

3 primary boxes (8 GB RAM, 4 CPUs)
3 worker boxes (1 GB RAM, 1 CPU)

As far as I can tell, my configuration files all match and are correct. In /etc/mesos/zk I have:

zk://106.133.117.128:2181,zk://153.213.95.171:2181,zk://106.121.34.29:2181/mesos

(I have changed the actual IP addresses, but I use these same numbers consistently throughout this post.)

I'm not sure where to go from here. I have stepped through each configuration.

The IDs are in /etc/zookeeper/conf/myid on each machine and are set correctly. In the ZooKeeper conf on each machine, the server entries are also set to the matching IPs and IDs.

My quorum is 2.

The IP and hostname are both set to each machine's own IP address.

The Marathon configuration in /etc/marathon/conf/master is:

zk://106.133.117.128:2181,zk://153.213.95.171:2181,zk://106.121.34.29:2181/marathon

The exact error in the logs is:

Log file created at: 2015/10/01 13:56:32
Running on machine: mesos-primary-1
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I1001 13:56:32.595760  6618 logging.cpp:172] INFO level logging started!
I1001 13:56:32.596060  6618 main.cpp:229] Build: 2015-09-25 19:13:24 by root
I1001 13:56:32.596082  6618 main.cpp:231] Version: 0.24.1
I1001 13:56:32.596094  6618 main.cpp:234] Git tag: 0.24.1
I1001 13:56:32.596106  6618 main.cpp:238] Git SHA: 44873806c2bb55da37e9adbece938274d8cd7c48
I1001 13:56:32.596161  6618 main.cpp:252] Using 'HierarchicalDRF' allocator
I1001 13:56:32.602738  6618 leveldb.cpp:176] Opened db in 6.456045ms
I1001 13:56:32.611217  6618 leveldb.cpp:183] Compacted db in 8.423531ms
I1001 13:56:32.611312  6618 leveldb.cpp:198] Created db iterator in 22068ns
I1001 13:56:32.611348  6618 leveldb.cpp:204] Seeked to beginning of db in 1287ns
I1001 13:56:32.611372  6618 leveldb.cpp:273] Iterated through 0 keys in the db in 376ns
I1001 13:56:32.611448  6618 replica.cpp:744] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
I1001 13:56:32.647243  6648 log.cpp:238] Attempting to join replica to ZooKeeper group
I1001 13:56:32.689388  6651 recover.cpp:449] Starting replica recovery
I1001 13:56:32.690028  6651 recover.cpp:475] Replica is in EMPTY status
W1001 13:56:32.690147  6649 zookeeper.cpp:101] zookeeper_init failed: Invalid argument ; retrying in 1 second
W1001 13:56:32.690726  6644 zookeeper.cpp:101] zookeeper_init failed: Invalid argument ; retrying in 1 second
W1001 13:56:32.690768  6647 zookeeper.cpp:101] zookeeper_init failed: Invalid argument ; retrying in 1 second
W1001 13:56:32.690821  6645 zookeeper.cpp:101] zookeeper_init failed: Invalid argument ; retrying in 1 second
I1001 13:56:32.690891  6618 main.cpp:465] Starting Mesos master
I1001 13:56:32.691463  6618 master.cpp:378] Master 20151001-135632-2088076136-5050-6618 (104.131.117.124) started on 106.133.117.128:5050
I1001 13:56:32.691494  6618 master.cpp:380] Flags at startup: --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="false" --authenticate_slaves="false" --authenticators="crammd5" --authorizers="local" --framework_sorter="drf" --help="false" --hostname="106.133.117.128" --initialize_driver_logging="true" --ip="106.133.117.128" --log_auto_initialize="true" --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --port="5050" --quiet="false" --quorum="2" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" --registry_strict="false" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/share/mesos/webui" --work_dir="/var/lib/mesos" --zk="zk://106.133.117.128:2181,zk://153.213.95.171:2181,zk://106.121.34.29:2181/mesoss" --zk_session_timeout="10secs"
I1001 13:56:32.691671  6618 master.cpp:427] Master allowing unauthenticated frameworks to register
I1001 13:56:32.691700  6618 master.cpp:432] Master allowing unauthenticated slaves to register
I1001 13:56:32.691725  6618 master.cpp:469] Using default 'crammd5' authenticator
W1001 13:56:32.691756  6618 authenticator.cpp:505] No credentials provided, authentication requests will be refused.
I1001 13:56:32.691790  6618 authenticator.cpp:512] Initializing server SASL
I1001 13:56:32.695333  6646 master.cpp:1464] Successfully attached file '/var/log/mesos/mesos-master.INFO'
I1001 13:56:32.695377  6646 contender.cpp:149] Joining the ZK group
W1001 13:56:33.690989  6649 zookeeper.cpp:101] zookeeper_init failed: Invalid argument ; retrying in 1 second
W1001 13:56:33.691220  6644 zookeeper.cpp:101] zookeeper_init failed: Invalid argument ; retrying in 1 second

Any input is greatly appreciated.

Your Mesos ZooKeeper string is malformed. It should be of the form

zk://host1:port1,host2:port2,host3:port3/path

This should fix your problem (barring any other configuration issues).
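To make the fix concrete, here is a minimal Python sketch that rewrites a string with a repeated zk:// prefix into the single-prefix form above. The function name normalize_zk_url is hypothetical and not part of Mesos or Marathon. It assumes plain host:port entries and does no DNS or connectivity checks.

```python
def normalize_zk_url(url):
    """Return the URL with one leading zk:// and comma-separated host:port entries.

    Hypothetical helper for illustration only; not part of Mesos or Marathon.
    """
    scheme = "zk://"
    if not url.startswith(scheme):
        raise ValueError("expected a URL starting with zk://")

    # Drop every zk:// occurrence, including the repeated ones before each host.
    stripped = url.replace(scheme, "")

    # Everything before the first "/" is the host list; the rest is the znode path.
    hosts, sep, path = stripped.partition("/")
    servers = [entry.strip() for entry in hosts.split(",") if entry.strip()]
    if not servers:
        raise ValueError("no host:port entries found")

    # Each entry must look like host:port with a numeric port.
    for entry in servers:
        host, _, port = entry.rpartition(":")
        if not host or not port.isdigit():
            raise ValueError("bad host:port entry: %r" % entry)

    return scheme + ",".join(servers) + sep + path


# The value from /etc/mesos/zk in the question.
broken = ("zk://106.133.117.128:2181,zk://153.213.95.171:2181,"
          "zk://106.121.34.29:2181/mesos")
print(normalize_zk_url(broken))
# prints zk://106.133.117.128:2181,153.213.95.171:2181,106.121.34.29:2181/mesos
```

The string in /etc/marathon/conf/master repeats the zk:// prefix in the same way, so it likely needs the same change, ending in /marathon instead of /mesos.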

Also answered on #mesos on freenode.