Hue Beeswax / HCat no longer working (Kerberos default user) after migration to HDP 2.2
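I have almost finished migrating my secured Hadoop cluster from HDP 2.1 to HDP 2.2.
Everything seems to work (including Hive from the command line), except Hue.
The file browser, job browser, Pig UI and Oozie UI are all fine, but the Beeswax and WebHCat UIs no longer work.
(Note: they worked before the migration, with the same hue.ini file.)
The error I get is: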
Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)
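It seems that Thrift is trying to authenticate against a default principal, krbtgt/LOCALDOMAIN, instead of the configured one.
I tried to log what happens in the Python files, but could not see where it picks up that default: the Kerberos principal short name is hive, and impersonation is enabled. The Hue and Hive proxy users are configured in the HDFS conf files.
The full stack trace is: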
[11/May/2015 06:10:40 +0000] access INFO 172.20.43.39 alinz - "GET /beeswax/ HTTP/1.0"
[11/May/2015 06:10:40 +0000] hive_server2_lib INFO use_sasl=True, mechanism=GSSAPI, kerberos_principal_short_name=hive, impersonation_enabled=True
[11/May/2015 06:10:40 +0000] thrift_util INFO Thrift exception; retrying: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)
[11/May/2015 06:10:40 +0000] thrift_util INFO Thrift exception; retrying: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)
[11/May/2015 06:10:40 +0000] thrift_util WARNING Out of retries for thrift call: OpenSession
[11/May/2015 06:10:40 +0000] thrift_util INFO Thrift saw a transport exception: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)
[11/May/2015 06:10:40 +0000] middleware INFO Processing exception: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database) (code THRIFTTRANSPORT): TTransportException('Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)',): Traceback (most recent call last):
  File "/usr/lib/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/views.py", line 69, in index
    return execute_query(request)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/views.py", line 526, in execute_query
    databases = _get_db_choices(request)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/views.py", line 1849, in _get_db_choices
    dbs = _get_databases(request)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/views.py", line 1844, in _get_databases
    dbs = db.get_databases()
  File "/usr/lib/hue/apps/beeswax/src/beeswax/server/dbms.py", line 110, in get_databases
    return self.client.get_databases()
  File "/usr/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 746, in get_databases
    return [table[col] for table in self._client.get_databases()]
  File "/usr/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 445, in get_databases
    res = self.call(self._client.GetSchemas, req)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 408, in call
    session = self.open_session(self.user)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 382, in open_session
    res = self._client.OpenSession(req)
  File "/usr/lib/hue/desktop/core/src/desktop/lib/thrift_util.py", line 329, in wrapper
    raise StructuredThriftTransportException(e, error_code=502)
StructuredThriftTransportException: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database) (code THRIFTTRANSPORT): TTransportException('Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)',)
Any idea what is going wrong?
krb5.conf is:
[libdefaults]
renew_lifetime = 7d
forwardable = true
default_realm = HADOOP.DEV
ticket_lifetime = 24h
dns_lookup_realm = false
dns_lookup_kdc = false
[logging]
default = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
kdc = FILE:/var/log/krb5kdc.log
[realms]
HADOOP.DEV = {
admin_server = bt1svlmy
kdc = bt1svlmy
}
and sudo klist -e /tmp/hue_krb5_ccache gives:
Ticket cache: FILE:/tmp/hue_krb5_ccache
Default principal: hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV
Valid starting Expires Service principal
05/11/15 15:10:34 05/12/15 15:10:34 krbtgt/HADOOP.DEV@HADOOP.DEV
renew until 05/11/15 15:10:34, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
05/11/15 15:49:52 05/12/15 15:10:34 HTTP/bt1svlmy.bpa.bouyguestelecom.fr@
renew until 05/11/15 15:10:34, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
05/11/15 15:49:52 05/12/15 15:10:34 HTTP/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV
renew until 05/11/15 15:10:34, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
I have a krbtgt/HADOOP.DEV@HADOOP.DEV ticket but no krbtgt/LOCALDOMAIN@HADOOP.DEV ticket; maybe that is the cause of the problem?
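On the krbtgt/LOCALDOMAIN question: as far as I understand the MIT Kerberos documentation for [domain_realm], when the KDC cannot find a service in the client's realm and no mapping applies, the library guesses the realm from the domain part of the service host name, converted to upper case. The sketch below only imitates that guess to show where LOCALDOMAIN could come from; fallback_realm and cross_realm_tgt are hypothetical helper names, not part of any Kerberos API.

```python
def fallback_realm(hostname):
    """Upper-cased domain part of hostname, or None if it has no domain."""
    _, _, domain = hostname.partition(".")
    return domain.upper() or None


def cross_realm_tgt(hostname, client_realm):
    """Cross-realm TGT a client might request once the service lookup
    in its own realm has failed (assumption: no [domain_realm] mapping)."""
    realm = fallback_realm(hostname)
    if realm is None or realm == client_realm:
        return None
    return "krbtgt/%s@%s" % (realm, client_realm)


# Host name taken from the failing request, realm from krb5.conf.
print(cross_realm_tgt("localhost.localdomain", "HADOOP.DEV"))
# -> krbtgt/LOCALDOMAIN@HADOOP.DEV
```

If that reading is right, the missing krbtgt/LOCALDOMAIN ticket is a symptom of the service host name being localhost.localdomain rather than the cause of the failure.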
The Kerberos (KDC) log file shows:
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0, hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for hive/localhost.localdomain@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0, hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for krbtgt/LOCALDOMAIN@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0, hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for hive/localhost.localdomain@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0, hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for krbtgt/LOCALDOMAIN@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0, hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for hive/localhost.localdomain@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0, hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for krbtgt/LOCALDOMAIN@HADOOP.DEV, Server not found in Kerberos database
It looks to me as if I am missing a default hostname somewhere in the configuration, but I cannot find a documentation entry for it.
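To make the KDC log easier to read, here is a rough sketch that pulls the requested service principals out of the TGS_REQ lines. The regular expression is my own guess at the line layout based on the entries above, not an official log format, and requested_services is a hypothetical helper name.

```python
import re

# Pattern written against the krb5kdc lines quoted above (an assumption,
# not a documented format).
TGS_RE = re.compile(
    r"TGS_REQ .*?: (?P<status>[A-Z_]+): authtime \d+, "
    r"(?P<client>\S+) for (?P<service>[^\s,]+),"
)


def requested_services(log_text):
    """Return distinct (status, service principal) pairs in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = TGS_RE.search(line)
        if match:
            pair = (match.group("status"), match.group("service"))
            if pair not in seen:
                seen.append(pair)
    return seen


KDC_LOG = """\
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0, hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for hive/localhost.localdomain@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0, hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for krbtgt/LOCALDOMAIN@HADOOP.DEV, Server not found in Kerberos database
"""

for status, service in requested_services(KDC_LOG):
    print(status, service)
```

The first principal requested is hive/localhost.localdomain@HADOOP.DEV, so Hue is asking for a HiveServer2 ticket on localhost rather than on the real Hive host.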
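OK, found it (I had to debug the whole Python stack to understand).
It is not really advertised, but some hue.ini parameter names have changed:
beeswax_server_host --> hive_server_host
beeswax_server_port --> hive_server_port
The default for hive_server_host is localhost, which is not correct on a secured cluster. With the old parameter names no longer read, Hue fell back to localhost and asked the KDC for hive/localhost.localdomain@HADOOP.DEV, which is exactly the principal the KDC log reports as unknown.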
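For anyone doing the same migration, a minimal sketch for spotting the old names in a hue.ini. It scans lines directly instead of using configparser, because hue.ini uses nested [[...]] sections. Only the two renames listed above are covered; find_legacy_keys, the sample text and the host hs2.example.com are made up for illustration.

```python
import re

# Renames reported above (old name -> new name).
RENAMED_KEYS = {
    "beeswax_server_host": "hive_server_host",
    "beeswax_server_port": "hive_server_port",
}

# An uncommented "key = value" line; lines starting with '#' do not match.
KEY_RE = re.compile(r"^\s*([A-Za-z0-9_]+)\s*=")


def find_legacy_keys(ini_text):
    """Return (line_number, old_key, new_key) for each uncommented legacy key."""
    hits = []
    for lineno, line in enumerate(ini_text.splitlines(), start=1):
        match = KEY_RE.match(line)
        if match and match.group(1) in RENAMED_KEYS:
            old = match.group(1)
            hits.append((lineno, old, RENAMED_KEYS[old]))
    return hits


# Hypothetical hue.ini fragment; the host name is a placeholder.
SAMPLE = """\
[beeswax]
  # beeswax_server_port=9999
  beeswax_server_host=hs2.example.com
  beeswax_server_port=10000
"""

for lineno, old, new in find_legacy_keys(SAMPLE):
    print("line %d: rename %s -> %s" % (lineno, old, new))
```

To check a real file, pass its contents instead of SAMPLE, for example open("/etc/hue/conf/hue.ini").read(); that path is an assumption and depends on the installation.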