Hue Beeswax / HCat no longer working (kerberos default user) after migration to HDP2.2

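I have almost finished migrating my secured HDP2.1 Hadoop cluster to HDP2.2. Everything seems to work (including Hive from the command line) except Hue. The file browser, job browser, Pig interface and Oozie interface are fine, but the Beeswax and WebHCat interfaces are not. (Note: they worked before the migration, with the same hue.ini file.)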

The error I get is: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)

It seems that thrift is trying to authenticate against a default user, krbtgt/LOCALDOMAIN, instead of the configured one.

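I tried to log what happens in the Python files, but could not see where it gets the default user from: the Kerberos principal short name is hive and impersonation is enabled. The Hue and Hive proxy users are configured in the HDFS conf files.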

The full stack trace is:

[11/May/2015 06:10:40 +0000] access       INFO     172.20.43.39 alinz - "GET /beeswax/ HTTP/1.0"
[11/May/2015 06:10:40 +0000] hive_server2_lib INFO     use_sasl=True, mechanism=GSSAPI, kerberos_principal_short_name=hive, impersonation_enabled=True
[11/May/2015 06:10:40 +0000] thrift_util  INFO     Thrift exception; retrying: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)
[11/May/2015 06:10:40 +0000] thrift_util  INFO     Thrift exception; retrying: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)
[11/May/2015 06:10:40 +0000] thrift_util  WARNING  Out of retries for thrift call: OpenSession
[11/May/2015 06:10:40 +0000] thrift_util  INFO     Thrift saw a transport exception: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)
[11/May/2015 06:10:40 +0000] middleware   INFO     Processing exception: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database) (code THRIFTTRANSPORT): TTransportException('Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)',): Traceback (most recent call last):
  File "/usr/lib/hue/build/env/lib/python2.6/site-packages/Django-1.2.3-py2.6.egg/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/views.py", line 69, in index
    return execute_query(request)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/views.py", line 526, in execute_query
    databases = _get_db_choices(request)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/views.py", line 1849, in _get_db_choices
    dbs = _get_databases(request)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/views.py", line 1844, in _get_databases
    dbs = db.get_databases()
  File "/usr/lib/hue/apps/beeswax/src/beeswax/server/dbms.py", line 110, in get_databases
    return self.client.get_databases()
  File "/usr/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 746, in get_databases
    return [table[col] for table in self._client.get_databases()]
  File "/usr/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 445, in get_databases
    res = self.call(self._client.GetSchemas, req)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 408, in call
    session = self.open_session(self.user)
  File "/usr/lib/hue/apps/beeswax/src/beeswax/server/hive_server2_lib.py", line 382, in open_session
    res = self._client.OpenSession(req)
  File "/usr/lib/hue/desktop/core/src/desktop/lib/thrift_util.py", line 329, in wrapper
    raise StructuredThriftTransportException(e, error_code=502)
StructuredThriftTransportException: Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database) (code THRIFTTRANSPORT): TTransportException('Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (Server krbtgt/LOCALDOMAIN@HADOOP.DEV not found in Kerberos database)',)

Any idea what is wrong?

krb5.conf is:


    [libdefaults]
      renew_lifetime = 7d
      forwardable = true
      default_realm = HADOOP.DEV
      ticket_lifetime = 24h
      dns_lookup_realm = false
      dns_lookup_kdc = false
    [logging]
      default = FILE:/var/log/krb5kdc.log
      admin_server = FILE:/var/log/kadmind.log
      kdc = FILE:/var/log/krb5kdc.log
    [realms]
      HADOOP.DEV = {
        admin_server = bt1svlmy
        kdc = bt1svlmy
      }

sudo klist -e /tmp/hue_krb5_ccache gives:

Ticket cache: FILE:/tmp/hue_krb5_ccache
Default principal: hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV

Valid starting     Expires            Service principal
05/11/15 15:10:34  05/12/15 15:10:34  krbtgt/HADOOP.DEV@HADOOP.DEV
        renew until 05/11/15 15:10:34, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
05/11/15 15:49:52  05/12/15 15:10:34  HTTP/bt1svlmy.bpa.bouyguestelecom.fr@
        renew until 05/11/15 15:10:34, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96
05/11/15 15:49:52  05/12/15 15:10:34  HTTP/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV
        renew until 05/11/15 15:10:34, Etype (skey, tkt): aes256-cts-hmac-sha1-96, aes256-cts-hmac-sha1-96

I have a krbtgt/HADOOP.DEV@HADOOP.DEV ticket but no krbtgt/LOCALDOMAIN@HADOOP.DEV ticket; maybe that is the cause of the problem?

The Kerberos log file shows:

May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0,  hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for hive/localhost.localdomain@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0,  hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for krbtgt/LOCALDOMAIN@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0,  hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for hive/localhost.localdomain@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0,  hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for krbtgt/LOCALDOMAIN@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0,  hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for hive/localhost.localdomain@HADOOP.DEV, Server not found in Kerberos database
May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) 172.19.115.50: UNKNOWN_SERVER: authtime 0,  hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for krbtgt/LOCALDOMAIN@HADOOP.DEV, Server not found in Kerberos database

It looks to me like I am missing a default hostname somewhere in the conf, but I cannot find a documentation entry for it.
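One way to read the KDC log above is to list the distinct service principals it rejected. Here is a minimal Python sketch of that; `unknown_services` is a hypothetical helper, and the regular expression is an assumption based only on the log lines shown here, so it may need adjusting for other krb5kdc versions.

```python
import re

# Matches the "<client> for <service>, <reason>" tail of an UNKNOWN_SERVER line.
# The pattern is inferred from the sample log lines only.
PATTERN = re.compile(r"UNKNOWN_SERVER: .*?,\s+(\S+) for ([^\s,]+), (.*)$")


def unknown_services(log_lines):
    """Return distinct (client, service) pairs the KDC rejected, in first-seen order."""
    pairs = []
    for line in log_lines:
        match = PATTERN.search(line)
        if match:
            pair = (match.group(1), match.group(2))
            if pair not in pairs:
                pairs.append(pair)
    return pairs


# Two of the log lines from this post, used as sample input.
SAMPLE_LOG = [
    "May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) "
    "172.19.115.50: UNKNOWN_SERVER: authtime 0,  "
    "hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for "
    "hive/localhost.localdomain@HADOOP.DEV, Server not found in Kerberos database",
    "May 11 16:12:35 bt1svlmy krb5kdc[12636](info): TGS_REQ (4 etypes {18 17 16 23}) "
    "172.19.115.50: UNKNOWN_SERVER: authtime 0,  "
    "hue/bt1svlmy.bpa.bouyguestelecom.fr@HADOOP.DEV for "
    "krbtgt/LOCALDOMAIN@HADOOP.DEV, Server not found in Kerberos database",
]

for client, service in unknown_services(SAMPLE_LOG):
    print("%s requested %s" % (client, service))
```

On these lines the requested services are hive/localhost.localdomain@HADOOP.DEV and krbtgt/LOCALDOMAIN@HADOOP.DEV, so the client is asking for a Hive service on localhost.localdomain rather than on a real cluster host.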

OK, found it (I had to debug the full Python stack to understand). It is not really advertised, but some hue.ini parameter names have changed:

  • beeswax_server_host --> hive_server_host
  • beeswax_server_port --> hive_server_port

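hive_server_host defaults to localhost, which is not correct on a secured cluster. With the old parameter names ignored, Hue fell back to that default and requested a ticket for hive/localhost.localdomain@HADOOP.DEV, which is the principal the KDC log above reports as not found.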
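To catch this kind of rename in an existing hue.ini, a small check along the following lines may help. It is only a sketch: `check_hue_ini` is a hypothetical helper, the sample host `hs2.example.com` and port 10000 are placeholders, and the parser reads `key=value` lines without tracking which section they belong to, which is a simplification of the real hue.ini format.

```python
import re

# Old -> new option names, as listed in this post.
RENAMED = {
    "beeswax_server_host": "hive_server_host",
    "beeswax_server_port": "hive_server_port",
}


def check_hue_ini(text):
    """Return a list of warnings for a hue.ini passed in as a string."""
    seen = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments and section headers such as [beeswax].
        if not line or line.startswith(("#", ";", "[")):
            continue
        match = re.match(r"([A-Za-z0-9_]+)\s*=\s*(.*)", line)
        if match:
            seen[match.group(1)] = match.group(2).strip()

    warnings = []
    for old, new in RENAMED.items():
        if old in seen and new not in seen:
            warnings.append("%s is set but %s is not; rename it" % (old, new))
    # If hive_server_host is absent, assume the localhost default described above.
    host = seen.get("hive_server_host", "localhost")
    if host in ("localhost", "127.0.0.1"):
        warnings.append("hive_server_host is %s; use the HiveServer2 FQDN" % host)
    return warnings


# Placeholder configuration that still uses the old names.
SAMPLE = """
[beeswax]
  beeswax_server_host=hs2.example.com
  beeswax_server_port=10000
"""

for warning in check_hue_ini(SAMPLE):
    print(warning)
```

Running it against a file that only sets the old names flags both renames and the localhost fallback; once the two options are renamed and point at the real HiveServer2 host, the list comes back empty.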