Hanging MapReduce job while reading HBase tables

Question

I have a 4-node distributed Hadoop cluster (including HBase) set up like this:

node1 - namenode + hbase master + zookeeper
node2 - resourcemanager
node3 - datanode1 + hbase regionserver1 + nodemanager
node4 - datanode2 + hbase regionserver2 + nodemanager

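The cluster setup seems fine, since all the web UIs (HBase, namenode, resourcemanager) are up. However, when I submit a MapReduce job that reads from or writes to HBase tables, it hangs and keeps timing out. If I specify the HBase connection settings explicitly in my MapReduce code and set them on the job, the same job works fine: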

Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "10.211.55.101");
conf.set("hbase.zookeeper.property.clientPort", "2181");
conf.set("hbase.master", "10.211.55.101:60000");
// 10.211.55.101 is the IP address of node1

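These properties are already set in the HBase configuration on node1, node3 and node4. My questions are: do I need to set up any HBase configuration on node2, where only the resourcemanager runs? And why does the same job work when the HBase configuration is set explicitly in the code?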

Answer 1

The HBaseConfiguration.create() method loads its configuration from hbase-site.xml. Make sure hbase-site.xml is available on the classpath on node2.
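For illustration, a minimal hbase-site.xml carrying the ZooKeeper settings from the question might look like the fragment below. This is only a sketch: the values are copied from the question's conf.set(...) calls, and a real file on the cluster would normally contain more properties than these two.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- ZooKeeper host taken from the question (node1) -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>10.211.55.101</value>
  </property>
  <!-- ZooKeeper client port taken from the question -->
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
```

If a file like this is visible on the classpath, HBaseConfiguration.create() should pick the values up without any explicit conf.set(...) calls in the job code.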

The HBase documentation states the following:

The configuration used by a Java client is kept in an HBaseConfiguration instance. The factory method on HBaseConfiguration, HBaseConfiguration.create();, on invocation, will read in the content of the first hbase-site.xml found on the client's CLASSPATH, if one is present (Invocation will also factor in any hbase-default.xml found; an hbase-default.xml ships inside the hbase.X.X.X.jar)
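One way to check whether a given JVM can actually see hbase-site.xml is to ask the class loader for it directly. The sketch below uses only the Java standard library; the class name HBaseSiteCheck and its helper methods are made up for this example and are not part of HBase. It assumes the lookup that matters is a plain classpath resource lookup, which is what the documentation quoted above describes.

```java
import java.net.URL;

public class HBaseSiteCheck {

    /** Returns the URL of the named classpath resource, or null if it is absent. */
    public static URL findOnClasspath(String resourceName) {
        // Prefer the thread context class loader, fall back to this class's loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = HBaseSiteCheck.class.getClassLoader();
        }
        return loader.getResource(resourceName);
    }

    /** Builds a one-line, human-readable report for the named resource. */
    public static String describe(String resourceName) {
        URL location = findOnClasspath(resourceName);
        return location == null
                ? resourceName + " NOT found on classpath"
                : resourceName + " found at " + location;
    }

    public static void main(String[] args) {
        // hbase-site.xml holds the site-specific settings (e.g. the ZooKeeper quorum).
        System.out.println(describe("hbase-site.xml"));
        // hbase-default.xml normally ships inside the HBase jar.
        System.out.println(describe("hbase-default.xml"));
    }
}
```

Running this with the same classpath as the job client shows whether hbase-site.xml is picked up there. If it reports "NOT found", the explicit conf.set(...) calls from the question are probably the only source of the ZooKeeper settings.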

hadoop

hbase

mapreduce

distributed-computing

bigdata