Cloudera/CDH v6.1.x + Python HappyBase v1.1.0: TTransportException(type=4, message='TSocket read 0 bytes')


EDIT: This question and answer apply to anyone who is experiencing the exception stated in the subject line: TTransportException(type=4, message='TSocket read 0 bytes'), whether or not Cloudera and/or HappyBase is involved.

The root issue (as it turned out) is that the protocol and/or transport format used on the client side did not match what the server side was implementing, and this can happen with any client/server pairing. Mine just happened to be Cloudera and HappyBase, but yours needn't be for you to run into this same issue.
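To make the mismatch concrete, here is a minimal sketch (standard library only, no Thrift install needed) of what a server expecting the framed transport sees when a client sends an unframed binary-protocol call. The helper names are made up for illustration, and how a given server reacts to the bogus length is an assumption on my part; a server that simply drops the connection would show up on the client as a read of 0 bytes.

```python
import struct

VERSION_1 = 0x80010000  # strict TBinaryProtocol version marker
CALL = 1                # Thrift message type for a method call


def binary_call_header(method, seqid=0):
    """Illustrative helper: start of a strict binary-protocol CALL message."""
    name = method.encode("utf-8")
    return (struct.pack("!I", VERSION_1 | CALL)
            + struct.pack("!i", len(name)) + name
            + struct.pack("!i", seqid))


def frame(payload):
    """Framed transport: 4-byte big-endian length prefix, then the payload."""
    return struct.pack("!i", len(payload)) + payload


def read_frame_length(wire):
    """What a framed-transport reader takes as the frame size."""
    return struct.unpack("!i", wire[:4])[0]


unframed = binary_call_header("getTableNames")
framed = frame(unframed)

print(read_frame_length(framed))    # the real payload length
print(read_frame_length(unframed))  # protocol header misread as a (negative) length
```

In the unframed case the first four bytes are the protocol's version marker, not a length, so the reader ends up with a nonsensical frame size before it has parsed anything.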

Has anyone recently tried using the happybase v1.1.0 (latest) Python package to interact with HBase on Cloudera CDH v6.1.x?

I have tried various options, but I keep getting this exception:

thriftpy.transport.TTransportException:
TTransportException(type=4, message='TSocket read 0 bytes')

Here is how I start a session and issue a simple call to list the tables (using Python v3.6.7):

import happybase

CDH6_HBASE_THRIFT_VER='0.92'

hbase_cnxn = happybase.Connection(
    host='vps00', port=9090,
    table_prefix=None,
    compat=CDH6_HBASE_THRIFT_VER,
    table_prefix_separator=b'_',
    timeout=None,
    autoconnect=True,
    transport='buffered',
    protocol='binary'
)

print('tables:', hbase_cnxn.tables()) # Exception happens here.

Below is how Cloudera CDH v6.1.x starts the HBase Thrift server (truncated for brevity):

/usr/java/jdk1.8.0_141-cloudera/bin/java [... snip ... ] \
    org.apache.hadoop.hbase.thrift.ThriftServer start \
    --port 9090 -threadpool --bind 0.0.0.0 --framed --compact

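I have tried many combinations of options but got nowhere.

Has anyone gotten this to work?

EDIT: I next compiled Hbase.thrift (from the HBase source files, the same HBase version used by CDH v6.1.x) and used the Python thrift bindings package directly (in other words, I removed happybase from the equation), and I got the same exception.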

(._.);

Thanks!

After a day of working on this, the answer to my question is the following:

import happybase

CDH6_HBASE_THRIFT_VER='0.92'

hbase_cnxn = happybase.Connection(
    host='vps00', port=9090,
    table_prefix=None,
    compat=CDH6_HBASE_THRIFT_VER,
    table_prefix_separator=b'_',
    timeout=None,
    autoconnect=True,
    transport='framed',  # Default: 'buffered'  <---- Changed.
    protocol='compact'   # Default: 'binary'    <---- Changed.
)

print('tables:', hbase_cnxn.tables()) # Works. Output: [b'ns1:mytable', ]

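Note that although this Q&A is framed in the context of Cloudera, it turns out (as you can see) that the issue is related to the Thrift version and the Thrift server-side configuration, so it also applies to Hortonworks and MapR users.

Explanation: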

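On Cloudera CDH v6.1.x (and probably future versions too), if you visit the HBase Thrift Server Configuration section of its management U.I., you will find, among many other settings, the options for the compact protocol and the framed transport.

Note that both the compact protocol and the framed transport are enabled; they therefore need to be changed accordingly in happybase from their default values (as I showed above).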
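If you cannot get at the management U.I., the same information is visible in the Thrift server's command line (the --framed and --compact flags in the question). As a rough sketch, the hypothetical helper below maps such a command line to the matching happybase.Connection keyword arguments. It only recognizes the long-form flags shown in the question, so treat it as an illustration rather than a complete parser.

```python
import shlex


def happybase_kwargs_from_cmdline(cmdline):
    """Hypothetical helper: derive happybase settings from ThriftServer flags."""
    args = shlex.split(cmdline)
    kwargs = {
        # happybase defaults apply when the server flag is absent.
        "transport": "framed" if "--framed" in args else "buffered",
        "protocol": "compact" if "--compact" in args else "binary",
    }
    if "--port" in args:
        idx = args.index("--port")
        if idx + 1 < len(args):
            kwargs["port"] = int(args[idx + 1])
    return kwargs


cmd = ("org.apache.hadoop.hbase.thrift.ThriftServer start "
       "--port 9090 -threadpool --bind 0.0.0.0 --framed --compact")
settings = happybase_kwargs_from_cmdline(cmd)
print(settings)
```

The resulting dict can be passed along with the host, e.g. happybase.Connection(host='vps00', **settings).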

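As mentioned in the EDIT follow-up to my original question, I also investigated a pure Thrift (non-happybase) solution. With analogous changes to the Python code for that case, I got it working too. Here is the code you should use for a pure Thrift solution (please read my comment annotations in it):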

from thrift.protocol import TCompactProtocol             # Notice the import: TCompactProtocol [!]
from thrift.transport.TTransport import TFramedTransport # Notice the import: TFramedTransport [!]
from thrift.transport import TSocket
from hbase import Hbase
   # -- This hbase module is compiled using the thrift(1) command (version >= 0.10 [!])
   #    and a Hbase.thrift file (obtained from http://archive.apache.org/dist/hbase/).
   # -- Also, your "pip freeze | grep '^thrift='" should show a version of >= 0.10 [!]
   #    if you want Python3 support.

host, port = "vps00", 9090  # Port as an int, as in the happybase example.
transport = TFramedTransport(TSocket.TSocket(host, port))
protocol  = TCompactProtocol.TCompactProtocol(transport)
client = Hbase.Client(protocol)

transport.open()

# Do stuff here ...
print(client.getTableNames()) # Works. Output: [b'ns1:mytable', ]

transport.close()

I hope this spares people the pain I went through. =:)


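I recently ran into this problem as well while using HBase on CDH 6.3.2. Following the configuration above was not enough on its own. I also had to turn off hbase.regionserver.thrift.http and hbase.thrift.support.proxyuser before the connection succeeded.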
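For anyone who wants to check those two settings without clicking through the U.I., here is a small sketch that scans the contents of an hbase-site.xml for them. The function name and the sample XML are made up for illustration, and where the effective hbase-site.xml lives on your cluster is something you will have to find for your own setup.

```python
import xml.etree.ElementTree as ET

# The two properties named above.
BLOCKING_PROPS = (
    "hbase.regionserver.thrift.http",
    "hbase.thrift.support.proxyuser",
)


def enabled_blockers(hbase_site_xml):
    """Return the names of the properties above that are set to true."""
    root = ET.fromstring(hbase_site_xml)
    enabled = []
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip().lower()
        if name in BLOCKING_PROPS and value == "true":
            enabled.append(name)
    return enabled


# Made-up sample config, not taken from a real cluster.
sample = """<configuration>
  <property><name>hbase.regionserver.thrift.http</name><value>true</value></property>
  <property><name>hbase.thrift.support.proxyuser</name><value>false</value></property>
</configuration>"""

still_enabled = enabled_blockers(sample)
print(still_enabled)
```

An empty list means neither setting is switched on in the file that was checked.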