In HBase, can the presence of a large table affect the performance of other smaller tables?

Question

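In our lower environment we have a single table with 100,000,000 rows, and a straight scan of this table returns about 2,800 rows per second. In our production environment we have one table with 100,000,000 rows and another with about 4 billion rows, and scanning the smaller table yields only 1,000 rows per second. No other activity is going on in either case, so could the presence of the larger table be causing the slowdown?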

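Both tables have a single column family. The large table has 400 columns, but any given record populates only 1 of them. The smaller table has only one column, and it is always populated.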

Answer 1

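You can try specifying in hbase-site.xml how much of HBase's resources are allocated to Scans. You can do this in two steps:

1. Specify the percentage of resources allocated to reads (as opposed to writes).
2. Specify the percentage of the READ resources allocated to Scans (as opposed to Gets).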

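In the example below, there are 96 CPUs in the whole cluster. You allocate 80% of their attention to reads, and then 80% of that read share to scans.

See whether this makes any difference for you.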

    <property>
        <name>hbase.regionserver.handler.count</name>
        <value>96</value> <!-- roughly # of CPUs in the whole cluster -->
    </property>
    <property>
        <name>hbase.ipc.server.callqueue.read.ratio</name>
        <value>0.8</value>
    </property>
    <property>
        <name>hbase.ipc.server.callqueue.scan.ratio</name>
        <value>0.8</value>
    </property>
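
To get a feel for what these two ratios mean in numbers, here is a rough arithmetic sketch in Python. It is not HBase code: the function name `split_handlers` and its rounding are assumptions made for illustration, and HBase applies the ratios internally to its call queues, so the real split may round differently.

```python
def split_handlers(handler_count: int, read_ratio: float, scan_ratio: float) -> dict:
    """Approximate how a handler budget divides among writes, gets and scans.

    Illustrative only: the rounding here is an assumption, not HBase's exact logic.
    """
    if not (0.0 <= read_ratio <= 1.0 and 0.0 <= scan_ratio <= 1.0):
        raise ValueError("ratios must be between 0 and 1")

    # Step 1: share of all handlers reserved for reads (the rest serve writes).
    read_handlers = round(handler_count * read_ratio)
    write_handlers = handler_count - read_handlers

    # Step 2: share of the read handlers reserved for scans (the rest serve gets).
    scan_handlers = int(read_handlers * scan_ratio)  # truncate toward zero
    get_handlers = read_handlers - scan_handlers

    return {"write": write_handlers, "get": get_handlers, "scan": scan_handlers}


if __name__ == "__main__":
    # Values taken from the configuration above: 96 handlers, 0.8 and 0.8.
    print(split_handlers(96, 0.8, 0.8))
```

With the values from the configuration above, this gives roughly 61 handlers for scans, 16 for gets and 19 for writes, which shows that most of the capacity ends up dedicated to scans.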

hbase