Setting up Mesos with Ansible on Ubuntu 14.04 on Digital Ocean
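I have been following the tutorial "How to configure a production ready Mesos cluster" and building an Ansible playbook along the way, which you can see here: mesos ansible playbook.
Ansible runs successfully, and I can reach port 5050 on the masters and see the Mesos dashboard. However, there appear to be three problems. I hope they are all related, but on the surface they look separate.
- At the top of the Mesos dashboard, it says that no masters are currently leading
- No slaves are registered
- When I visit port 8080 on any of the masters, the Marathon dashboard does not work
Any ideas about what I am doing wrong, or whether anything has changed since the tutorial was published?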
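Edit: I tried to dig into this some more. After running Ansible, I logged in to each node and manually restarted the Mesos and Marathon services. That seemed to do the trick, as I got to the Marathon dashboard, and after some fiddling with the slaves I could see those getting activated as well. Unfortunately, after nuking and rebuilding the nodes, I could not reproduce it. My setup is consistent with the tutorial I linked and the one Celine linked, so I think the problem is related to the order in which I restart the services. Still looking for help.
Edit 2:
A copy of the log from one of the masters at startup. The last HTTP call just repeats over and over.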
I1014 18:56:32.746968 11494 logging.cpp:172] INFO level logging started!
I1014 18:56:32.748177 11494 main.cpp:229] Build: 2015-10-12 20:57:28 by root
I1014 18:56:32.748277 11494 main.cpp:231] Version: 0.25.0
I1014 18:56:32.748345 11494 main.cpp:234] Git tag: 0.25.0
I1014 18:56:32.748406 11494 main.cpp:238] Git SHA: 2dd7f7ee115fe00b8e098b0a10762a4fa8f4600f
I1014 18:56:32.748615 11494 main.cpp:252] Using 'HierarchicalDRF' allocator
I1014 18:56:32.759768 11494 leveldb.cpp:176] Opened db in 10.929155ms
I1014 18:56:32.763638 11494 leveldb.cpp:183] Compacted db in 3.722708ms
I1014 18:56:32.763713 11494 leveldb.cpp:198] Created db iterator in 33931ns
I1014 18:56:32.763761 11494 leveldb.cpp:204] Seeked to beginning of db in 8624ns
I1014 18:56:32.764142 11494 leveldb.cpp:273] Iterated through 1 keys in the db in 352415ns
I1014 18:56:32.764263 11494 replica.cpp:744] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
I1014 18:56:32.767266 11520 log.cpp:238] Attempting to join replica to ZooKeeper group
I1014 18:56:32.767493 11520 recover.cpp:449] Starting replica recovery
I1014 18:56:32.767623 11520 recover.cpp:475] Replica is in VOTING status
I1014 18:56:32.767695 11520 recover.cpp:464] Recover process terminated
I1014 18:56:32.775274 11494 main.cpp:465] Starting Mesos master
I1014 18:56:32.779567 11516 master.cpp:376] Master 75abeaaa-a949-45a3-bd85-bebf100eecad (159.203.107.10) started on 159.203.107.10:5050
I1014 18:56:32.779597 11516 master.cpp:378] Flags at startup: --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="false" --authenticate_slaves="false" --authenticators="crammd5" --authorizers="local" --framework_sorter="drf" --help="false" --hostname="159.203.107.10" --hostname_lookup="true" --initialize_driver_logging="true" --ip="159.203.107.10" --log_auto_initialize="true" --log_dir="/var/log/mesos" --logbufsecs="0" --logging_level="INFO" --max_slave_ping_timeouts="5" --port="5050" --quiet="false" --quorum="1" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="5secs" --registry_strict="false" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/share/mesos/webui" --work_dir="/var/lib/mesos" --zk="zk://159.203.107.10:2181,159.203.107.151:2181,159.203.107.162:2181/mesos" --zk_session_timeout="10secs"
I1014 18:56:32.779762 11516 master.cpp:425] Master allowing unauthenticated frameworks to register
I1014 18:56:32.779770 11516 master.cpp:430] Master allowing unauthenticated slaves to register
I1014 18:56:32.779778 11516 master.cpp:467] Using default 'crammd5' authenticator
W1014 18:56:32.779798 11516 authenticator.cpp:505] No credentials provided, authentication requests will be refused
I1014 18:56:32.779906 11516 authenticator.cpp:512] Initializing server SASL
I1014 18:56:32.791836 11515 master.cpp:1542] Successfully attached file '/var/log/mesos/mesos-master.INFO'
I1014 18:56:32.792043 11519 contender.cpp:149] Joining the ZK group
I1014 18:56:34.968217 11517 http.cpp:336] HTTP GET for /master/state.json from 12.228.115.34:40863 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'
I1014 18:56:45.242039 11518 http.cpp:336] HTTP GET for /master/state.json from 12.228.115.34:63018 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36'
I1014 18:56:55.319259 11519 http.cpp:336] HTTP GET for /master/state.json from 12.228.115.34:50024 with User-Agent='Mozilla/5.0 (Macintosh; Intel Mac OS X 1
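Thanks
The first problem, "no masters are currently leading", is usually a ZooKeeper problem.
Check whether ZooKeeper is running on your servers. That could also explain your problems with Marathon and the Mesos slaves.
This documentation seems to be up to date: http://open.mesosphere.com/getting-started/datacenter/install/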
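One way to check whether ZooKeeper is actually answering on each server is to send it the `ruok` four-letter command and look for the reply `imok`. The following is a minimal sketch, not part of the original playbook. The function name `zk_is_ok` is made up for illustration, and port 2181 is taken from the `--zk` flag in the log above. It assumes four-letter commands are enabled, which is the default on ZooKeeper 3.4 but requires whitelisting on 3.5 and later.

```python
import socket


def zk_is_ok(host, port=2181, timeout=3.0):
    """Return True if a ZooKeeper server at host:port answers 'ruok' with 'imok'.

    Any connection error or timeout is treated as "not OK".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # The server sends its short reply and then closes the connection.
            while len(reply) < 16:
                chunk = sock.recv(16)
                if not chunk:
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:
        return False


# Example (hosts taken from the --zk flag in the master log):
# for host in ("159.203.107.10", "159.203.107.151", "159.203.107.162"):
#     print(host, zk_is_ok(host))
```

Note that `imok` only shows that the server process is up and responding. It does not prove that the ensemble has formed a quorum, so a misconfigured cluster can still pass this check.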
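It was a ZooKeeper configuration problem. None of the tutorials mention that, beyond listing the server IPs, you need to set other values in zoo.cfg. You also need to set dataDir, syncLimit, initLimit, tickTime, and clientPort.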
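To catch this kind of omission before starting the services, a small script can report which of the five settings named above are missing from a zoo.cfg. This is only a sketch: the helper `missing_zoo_cfg_keys` is hypothetical, and the `server.N` lines in the sample use the conventional 2888:3888 peer and election ports, which are an assumption rather than something taken from the original playbook.

```python
# The five settings the answer above says must be present in zoo.cfg.
REQUIRED_KEYS = ("tickTime", "initLimit", "syncLimit", "dataDir", "clientPort")


def missing_zoo_cfg_keys(cfg_text):
    """Return the required keys that have no non-empty value in cfg_text."""
    present = set()
    for line in cfg_text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if value.strip():
            present.add(key.strip())
    return [key for key in REQUIRED_KEYS if key not in present]


# A config that only lists the servers, as the tutorials suggest.
# The 2888:3888 ports are assumed for illustration.
servers_only = """
server.1=159.203.107.10:2888:3888
server.2=159.203.107.151:2888:3888
server.3=159.203.107.162:2888:3888
"""

print(missing_zoo_cfg_keys(servers_only))
# prints ['tickTime', 'initLimit', 'syncLimit', 'dataDir', 'clientPort']
```

In an Ansible setup, the same check could be run against the rendered template on each master before the ZooKeeper, Mesos, and Marathon services are restarted.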