How do I get around the happybase "TApplicationException: Internal error processing mutateRows" error?
I am using happybase to connect to my HBase database. I created a sample table called 'irisSample'. This is the part of the code I am having trouble with:
import happybase
from happybase import *
import json
connection = happybase.Connection('<ip-address>', '9090')
table = connection.table('irisSample')
n = 0
x = 1
for u in y:
    data = {'petalWidth': points['petalWidth'][n], 'sepalLength': points['sepalLength'][n],
            'petalLength': points['petalLength'][n], 'label': u}
    row = 'row' + str(x)
    table.put(row, {'flowers': str(data)})
    n += 1
    x += 1
I get the following traceback.
TApplicationException Traceback (most recent call last)
<ipython-input-15-94741a8b04dc> in <module>()
9 'petalLength':points['petalLength'][n], 'label': u}
10 row = 'row' + str(x)
---> 11 table.put(row, {'flowers': str(data)})
12 n += 1
13 x += 1
/root/anaconda/lib/python2.7/site-packages/happybase/table.pyc in put(self, row, data, timestamp, wal)
437 """
438 with self.batch(timestamp=timestamp, wal=wal) as batch:
--> 439 batch.put(row, data)
440
441 def delete(self, row, columns=None, timestamp=None, wal=True):
/root/anaconda/lib/python2.7/site-packages/happybase/batch.pyc in __exit__(self, exc_type, exc_value, traceback)
130 return
131
--> 132 self.send()
/root/anaconda/lib/python2.7/site-packages/happybase/batch.pyc in send(self)
53 self._table.name, self._mutation_count, len(bms))
54 if self._timestamp is None:
---> 55 self._table.connection.client.mutateRows(self._table.name, bms, {})
56 else:
57 self._table.connection.client.mutateRowsTs(
/root/anaconda/lib/python2.7/site-packages/happybase/hbase/Hbase.pyc in mutateRows(self, tableName, rowBatches, attributes)
1574 """
1575 self.send_mutateRows(tableName, rowBatches, attributes)
-> 1576 self.recv_mutateRows()
1577
1578 def send_mutateRows(self, tableName, rowBatches, attributes):
/root/anaconda/lib/python2.7/site-packages/happybase/hbase/Hbase.pyc in recv_mutateRows(self)
1592 x.read(self._iprot)
1593 self._iprot.readMessageEnd()
-> 1594 raise x
1595 result = mutateRows_result()
1596 result.read(self._iprot)
TApplicationException: Internal error processing mutateRows
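I also tried json.dumps(data) instead of str(data), which raised the same exception.
From what I have gathered, this looks more like a Thrift issue, but I could be wrong. I may have to look into starbase. I don't know, which is why I am asking you all.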
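I found the answer here. It requires the HBase REST server to be running on an unused port. I then connected to that port and was able to list all the tables in HBase and insert data into the database. The command is ./hbase-daemon.sh start rest -p <unused port number>. I could not find an answer anywhere, and nobody commented for days, so I hope this helps you! Here is the code. I ended up using starbase.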
import starbase
from starbase import Connection
# 8000 is the default port.
c = Connection(host='<ip-address>', port=8000)
t = c.table('irisSample')
n = 0  # row counter (its initialization was not shown in the original snippet)
for flower in range(len(labels)):
    data = {'petalWidth': X[flower][3], 'petalLength': X[flower][2],
            'sepalLength': X[flower][0], 'sepalWidth': X[flower][1], 'cluster': labels[flower]}
    n += 1
    row = 'row' + str(n)
    t.insert(row, {'flowers': data})
In your "data" object, it looks like you are not providing the column family along with the column name.
data = {'petalWidth': points['petalWidth'][n], 'sepalLength': points['sepalLength'][n], 'petalLength': points['petalLength'][n], 'label': u}
You need the column family in front of each column name. For example, if your column family is "cf", you want 'cf:petalWidth' instead of 'petalWidth'.
data = {'cf:petalWidth': points['petalWidth'][n], 'cf:sepalLength': points['sepalLength'][n], 'cf:petalLength': points['petalLength'][n], 'cf:label': u}
Doing this fixed my mutateRows error.
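To make the column-family fix concrete, here is a minimal sketch of how the dictionary passed to table.put could be built. It assumes that 'flowers' is the column family of 'irisSample' (the question and the starbase answer suggest this, but it is not confirmed), and the helper names qualify and put_row are made up for this example. Keys and values are encoded to bytes, which is what happybase expects under Python 3; under Python 2.7 the encoding is harmless for ASCII data.

```python
def qualify(family, values):
    """Return a dict whose keys are b'family:column' and whose values are bytes.

    `family` is assumed to be an existing column family of the target table.
    """
    return {
        '{0}:{1}'.format(family, name).encode('utf-8'): str(value).encode('utf-8')
        for name, value in values.items()
    }


def put_row(table, row_key, family, values):
    """Write one row through a happybase-style table object (hypothetical helper)."""
    table.put(row_key.encode('utf-8'), qualify(family, values))


if __name__ == '__main__':
    # Hypothetical measurements, not taken from the original data set.
    sample = {'petalWidth': 0.2, 'sepalLength': 5.1, 'petalLength': 1.4, 'label': 0}
    print(qualify('flowers', sample))
```

With a real connection, the table object would come from happybase.Connection(...).table('irisSample'), and put_row(table, 'row1', 'flowers', sample) would then send one qualified mutation per measurement instead of a single unqualified 'flowers' key.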