Ambari Confirm Hosts step fails: Registration with the server failed

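I want to use Ambari to build a platform for testing some Spark features. I created two virtual machines running CentOS 7 (mercury.gc and venus.gc) on Windows 10 with Hyper-V. Ambari 2.2.2.0 is installed on one VM (mercury.gc), and I am trying to use it to provision both VMs. When I run Confirm Hosts, the process fails for both hosts. Here is the log from one machine: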

Connection to mercury.gc closed.
SSH command execution finished
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:58

Registering with the server...
Registration with the server failed.

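I have verified that passwordless SSH login works and that the firewall and SELinux are disabled. I cannot work out from the log what went wrong. Can anyone help me solve this?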
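Beyond SSH, the firewall, and SELinux, one more check that fits here is whether the agent host can open a TCP connection to the server port the agent log mentions (8440). The following is a minimal sketch, not part of Ambari; `can_connect` is a hypothetical helper name, and the host and port are taken from the log lines in this post.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

if __name__ == "__main__":
    # Host and port as they appear in the agent log; adjust for other setups.
    print("mercury.gc:8440 reachable:", can_connect("mercury.gc", 8440))
```

A successful connection only shows that the port is reachable. It says nothing about whether registration itself will succeed.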

The full log from Ambari is below:

==========================
Creating target directory...
==========================

Command start time 2016-07-18 00:29:51

Connection to mercury.gc closed.
SSH command execution finished
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:51

==========================
Copying common functions script...
==========================

Command start time 2016-07-18 00:29:51

scp /usr/lib/python2.6/site-packages/ambari_commons
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:52

==========================
Copying OS type check script...
==========================

Command start time 2016-07-18 00:29:52

scp /usr/lib/python2.6/site-packages/ambari_server/os_check_type.py
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:52

==========================
Running OS type check...
==========================

Command start time 2016-07-18 00:29:52
Cluster primary/cluster OS family is redhat7 and local/current OS family is redhat7

Connection to mercury.gc closed.
SSH command execution finished
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:52

==========================
Checking 'sudo' package on remote host...
==========================

Command start time 2016-07-18 00:29:52
sudo-1.8.6p7-17.el7_2.x86_64

Connection to mercury.gc closed.
SSH command execution finished
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:53

==========================
Copying repo file to 'tmp' folder...
==========================

Command start time 2016-07-18 00:29:53

scp /etc/yum.repos.d/ambari.repo
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:53

==========================
Moving file to repo dir...
==========================

Command start time 2016-07-18 00:29:53

Connection to mercury.gc closed.
SSH command execution finished
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:53

==========================
Changing permissions for ambari.repo...
==========================

Command start time 2016-07-18 00:29:53

Connection to mercury.gc closed.
SSH command execution finished
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:54

==========================
Copying setup script file...
==========================

Command start time 2016-07-18 00:29:54

scp /usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:54

==========================
Running setup agent script...
==========================

Command start time 2016-07-18 00:29:54
("INFO 2016-07-18 00:16:30,697 threadpool.py:52 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2016-07-18 00:16:30,697 AlertSchedulerHandler.py:246 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2016-07-18 00:16:30,697 AlertSchedulerHandler.py:142 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x1de0ad0>; currently running: False
INFO 2016-07-18 00:16:32,701 hostname.py:89 - Read public hostname 'mercury.gc' using socket.getfqdn()
INFO 2016-07-18 00:16:32,805 ExitHelper.py:53 - Performing cleanup before exiting...
INFO 2016-07-18 00:16:32,805 ExitHelper.py:67 - Cleanup finished, exiting with code:0
INFO 2016-07-18 00:29:55,955 main.py:74 - loglevel=logging.INFO
INFO 2016-07-18 00:29:55,957 main.py:74 - loglevel=logging.INFO
INFO 2016-07-18 00:29:55,958 DataCleaner.py:39 - Data cleanup thread started
INFO 2016-07-18 00:29:55,961 DataCleaner.py:120 - Data cleanup started
INFO 2016-07-18 00:29:55,961 DataCleaner.py:122 - Data cleanup finished
INFO 2016-07-18 00:29:56,012 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2016-07-18 00:29:56,013 main.py:289 - Connecting to Ambari server at https://mercury.gc:8440 (192.168.137.100)
INFO 2016-07-18 00:29:56,013 NetUtil.py:60 - Connecting to https://mercury.gc:8440/ca
INFO 2016-07-18 00:29:56,099 threadpool.py:52 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2016-07-18 00:29:56,100 AlertSchedulerHandler.py:246 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2016-07-18 00:29:56,100 AlertSchedulerHandler.py:142 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x10e6ad0>; currently running: False
INFO 2016-07-18 00:29:58,103 hostname.py:89 - Read public hostname 'mercury.gc' using socket.getfqdn()
INFO 2016-07-18 00:29:58,207 ExitHelper.py:53 - Performing cleanup before exiting...
INFO 2016-07-18 00:29:58,207 ExitHelper.py:67 - Cleanup finished, exiting with code:0
", None)
("INFO 2016-07-18 00:16:30,697 threadpool.py:52 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2016-07-18 00:16:30,697 AlertSchedulerHandler.py:246 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2016-07-18 00:16:30,697 AlertSchedulerHandler.py:142 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x1de0ad0>; currently running: False
INFO 2016-07-18 00:16:32,701 hostname.py:89 - Read public hostname 'mercury.gc' using socket.getfqdn()
INFO 2016-07-18 00:16:32,805 ExitHelper.py:53 - Performing cleanup before exiting...
INFO 2016-07-18 00:16:32,805 ExitHelper.py:67 - Cleanup finished, exiting with code:0
INFO 2016-07-18 00:29:55,955 main.py:74 - loglevel=logging.INFO
INFO 2016-07-18 00:29:55,957 main.py:74 - loglevel=logging.INFO
INFO 2016-07-18 00:29:55,958 DataCleaner.py:39 - Data cleanup thread started
INFO 2016-07-18 00:29:55,961 DataCleaner.py:120 - Data cleanup started
INFO 2016-07-18 00:29:55,961 DataCleaner.py:122 - Data cleanup finished
INFO 2016-07-18 00:29:56,012 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2016-07-18 00:29:56,013 main.py:289 - Connecting to Ambari server at https://mercury.gc:8440 (192.168.137.100)
INFO 2016-07-18 00:29:56,013 NetUtil.py:60 - Connecting to https://mercury.gc:8440/ca
INFO 2016-07-18 00:29:56,099 threadpool.py:52 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2016-07-18 00:29:56,100 AlertSchedulerHandler.py:246 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2016-07-18 00:29:56,100 AlertSchedulerHandler.py:142 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x10e6ad0>; currently running: False
INFO 2016-07-18 00:29:58,103 hostname.py:89 - Read public hostname 'mercury.gc' using socket.getfqdn()
INFO 2016-07-18 00:29:58,207 ExitHelper.py:53 - Performing cleanup before exiting...
INFO 2016-07-18 00:29:58,207 ExitHelper.py:67 - Cleanup finished, exiting with code:0
", None)

Connection to mercury.gc closed.
SSH command execution finished
host=mercury.gc, exitcode=0
Command end time 2016-07-18 00:29:58

Registering with the server...
Registration with the server failed.

Update: here is my ambari-agent.log from mercury.gc:

INFO 2016-07-19 20:35:33,732 main.py:74 - loglevel=logging.INFO
INFO 2016-07-19 20:35:33,732 main.py:74 - loglevel=logging.INFO
INFO 2016-07-19 20:35:33,733 DataCleaner.py:39 - Data cleanup thread started
INFO 2016-07-19 20:35:33,734 DataCleaner.py:120 - Data cleanup started
INFO 2016-07-19 20:35:33,734 DataCleaner.py:122 - Data cleanup finished
INFO 2016-07-19 20:35:33,775 PingPortListener.py:50 - Ping port listener started on port: 8670
INFO 2016-07-19 20:35:33,776 main.py:289 - Connecting to Ambari server at https://mercury.gc:8440 (192.168.137.100)
INFO 2016-07-19 20:35:33,776 NetUtil.py:60 - Connecting to https://mercury.gc:8440/ca
INFO 2016-07-19 20:35:33,870 threadpool.py:52 - Started thread pool with 3 core threads and 20 maximum threads
WARNING 2016-07-19 20:35:33,870 AlertSchedulerHandler.py:246 - [AlertScheduler] /var/lib/ambari-agent/cache/alerts/definitions.json not found or invalid. No alerts will be scheduled until registration occurs.
INFO 2016-07-19 20:35:33,870 AlertSchedulerHandler.py:142 - [AlertScheduler] Starting <ambari_agent.apscheduler.scheduler.Scheduler object at 0x1a0aad0>; currently running: False
INFO 2016-07-19 20:35:35,874 hostname.py:89 - Read public hostname 'mercury.gc' using socket.getfqdn()
INFO 2016-07-19 20:35:35,999 ExitHelper.py:53 - Performing cleanup before exiting...
INFO 2016-07-19 20:35:36,000 ExitHelper.py:67 - Cleanup finished, exiting with code:0

Here is /etc/hosts on both hosts:

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.137.100 mercury.gc mercury
192.168.137.101 venus.gc venus
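
Entries like these can be sanity-checked by parsing the file and confirming that each cluster FQDN maps to a non-loopback address. The following is a minimal sketch, not part of Ambari; `parse_hosts` and `check_cluster_hosts` are hypothetical helper names, and the sample text mirrors the file above.

```python
import ipaddress

def parse_hosts(text):
    """Map every hostname or alias in /etc/hosts-style text to its IP address."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry seen for a name
    return mapping

def check_cluster_hosts(text, fqdns):
    """Return a list of problems found for the given FQDNs (empty if none)."""
    mapping = parse_hosts(text)
    problems = []
    for fqdn in fqdns:
        ip = mapping.get(fqdn)
        if ip is None:
            problems.append(f"{fqdn}: no entry")
        elif ipaddress.ip_address(ip).is_loopback:
            problems.append(f"{fqdn}: maps to loopback address {ip}")
    return problems

HOSTS = """\
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.137.100 mercury.gc mercury
192.168.137.101 venus.gc venus
"""

if __name__ == "__main__":
    issues = check_cluster_hosts(HOSTS, ["mercury.gc", "venus.gc"])
    print(issues or "hosts file looks consistent")
```

For the file shown here the check finds no problems, so name resolution is unlikely to be the cause.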

Here is my ambari.properties in /etc/ambari-server/conf/:

jdk1.7.dest-file=jdk-7u67-linux-x64.tar.gz
kerberos.keytab.cache.dir=/var/lib/ambari-server/data/cache
views.request.read.timeout.millis=10000
agent.package.install.task.timeout=1800
server.connection.max.idle.millis=900000
bootstrap.script=/usr/lib/python2.6/site-packages/ambari_server/bootstrap.py
server.version.file=/var/lib/ambari-server/resources/version
recovery.type=AUTO_START
api.authenticate=true
http.strict-transport-security=max-age=31536000
server.persistence.type=local
jdk1.8.jcpol-url=http://public-repo-1.hortonworks.com/ARTIFACTS/jce_policy-8.zip
jdk1.8.dest-file=jdk-8u60-linux-x64.tar.gz
rolling.upgrade.skip.packages.prefixes=
common.services.path=/var/lib/ambari-server/resources/common-services
http.x-frame-options=DENY
server.task.timeout=1200
jce.download.supported=true
agent.threadpool.size.max=25
recovery.lifetime_max_count=1024
jdk1.8.re=(jdk.*)/jre
ambari.python.wrap=ambari-python-wrap
ambari-server.user=root
agent.task.timeout=900
jdk1.7.url=http://public-repo-1.hortonworks.com/ARTIFACTS/jdk-7u67-linux-x64.tar.gz
server.jdbc.user.name=ambari
server.os_family=redhat7
java.home=/usr/java/jdk1.8.0_92/
server.jdbc.postgres.schema=ambari
jdk.name=jdk-8u60-linux-x64.tar.gz
user.inactivity.timeout.default=0
java.releases=jdk1.8,jdk1.7
skip.service.checks=false
shared.resources.dir=/usr/lib/ambari-server/lib/ambari_commons/resources
jdk.download.supported=true
recommendations.dir=/var/run/ambari-server/stack-recommendations
ulimit.open.files=10000
agent.stack.retry.tries=5

rolling.upgrade.min.stack=HDP-2.2
jdk1.8.desc=Oracle JDK 1.8 + Java Cryptography Extension (JCE) Policy Files 8
server.os_type=centos7
views.http.strict-transport-security=max-age=31536000
views.ambari.request.connect.timeout.millis=5000
views.request.connect.timeout.millis=5000
resources.dir=/var/lib/ambari-server/resources
custom.action.definitions=/var/lib/ambari-server/resources/custom_action_definitions
views.http.x-frame-options=SAMEORIGIN
recovery.enabled_components=METRICS_COLLECTOR
jdk1.7.re=(jdk.*)/jre
server.execution.scheduler.maxDbConnections=5
jdk1.7.desc=Oracle JDK 1.7 + Java Cryptography Extension (JCE) Policy Files 7
agent.stack.retry.on_repo_unavailability=false
views.ambari.request.read.timeout.millis=10000
jdk1.8.jcpol-file=jce_policy-8.zip
rolling.upgrade.max.stack=
server.http.session.inactive_timeout=1800
jdk1.7.jcpol-file=UnlimitedJCEPolicyJDK7.zip
server.execution.scheduler.misfire.toleration.minutes=480
security.server.keys_dir=/var/lib/ambari-server/keys
stackadvisor.script=/var/lib/ambari-server/resources/scripts/stack_advisor.py
server.tmp.dir=/var/lib/ambari-server/data/tmp
server.execution.scheduler.maxThreads=5
metadata.path=/var/lib/ambari-server/resources/stacks
server.fqdn.service.url=http://169.254.169.254/latest/meta-data/public-hostname
views.http.x-xss-protection=1; mode=block
webapp.dir=/usr/lib/ambari-server/web
bootstrap.dir=/var/run/ambari-server/bootstrap
#jdk1.7.home=/usr/jdk64/
jdk1.7.home=/usr/java/jdk1.8.0_92/
jdk1.8.url=http://public-repo-1.hortonworks.com/ARTIFACTS/jdk-8u60-linux-x64.tar.gz
#jdk1.8.home=/usr/jdk64/
jdk1.8.home=/usr/java/jdk1.8.0_92/
user.inactivity.timeout.role.readonly.default=0
http.x-xss-protection=1; mode=block
jce.name=jce_policy-8.zip
client.threadpool.size.max=25
jdk1.7.jcpol-url=http://public-repo-1.hortonworks.com/ARTIFACTS/UnlimitedJCEPolicyJDK7.zip

server.jdbc.user.passwd=/etc/ambari-server/conf/password.dat
server.execution.scheduler.isClustered=false
server.stages.parallel=true
bootstrap.setup_agent.script=/usr/lib/python2.6/site-packages/ambari_server/setupAgent.py
server.jdbc.database=postgres
server.jdbc.database_name=ambari

I found and solved the problem based on the discussion at https://community.hortonworks.com/questions/23409/there-is-a-problem-when-install-hdp-on-the-stepcon.html

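I think the cause was that I chose a non-English language (Traditional Chinese) as the default when installing CentOS 7, which led to a character-set problem (UTF-8 <-> ASCII) during Confirm Hosts. After I changed the default language to English, the problem went away.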
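
The kind of failure described here can be reproduced in isolation. The sketch below is illustrative only and is not Ambari's code: it shows that a localized message cannot be encoded as ASCII, and it includes a hypothetical helper, `locale_is_english`, for checking a `LANG` value such as `zh_TW.UTF-8` before running Confirm Hosts. Treating an unset `LANG` as the C locale is an assumption of this sketch.

```python
import os

def locale_is_english(lang):
    """True for C/POSIX or English locale names such as 'en_US.UTF-8'."""
    name = (lang or "C").split(".", 1)[0]  # strip the codeset, e.g. '.UTF-8'
    return name in ("C", "POSIX", "en") or name.startswith("en_")

def ascii_safe(text):
    """True if the text can be encoded as ASCII without error."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

if __name__ == "__main__":
    lang = os.environ.get("LANG", "")
    print(f"LANG={lang!r}, English locale: {locale_is_english(lang)}")
    # An English message encodes cleanly; a Traditional Chinese one does not.
    print(ascii_safe("Connection closed"), ascii_safe("連線已關閉"))
```

On CentOS 7 the system locale can be changed with `localectl set-locale LANG=en_US.UTF-8`, followed by logging in again so that new sessions pick it up.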