Combine two FilterLists with the MUST_PASS_ONE/ALL operators in a single Scan

Consider the following result of scan 'table' in the hbase shell:

ROW COLUMN+CELL
000 column=F:Q, timestamp=1519299345645, value=a
001 column=F:Q, timestamp=1519299345645, value=b
010 column=F:Q, timestamp=1519299345645, value=c
011 column=F:Q, timestamp=1519299345645, value=b
100 column=F:Q, timestamp=1519299345645, value=a
110 column=F:Q, timestamp=1519299345645, value=c
200 column=F:Q, timestamp=1519299345645, value=b
210 column=F:Q, timestamp=1519299345645, value=a

The scan result I want, for the example above, is:

ROW COLUMN+CELL
000 column=F:Q, timestamp=1519299345645, value=a
001 column=F:Q, timestamp=1519299345645, value=b
011 column=F:Q, timestamp=1519299345645, value=b
100 column=F:Q, timestamp=1519299345645, value=a

In the hbase shell it would be (ignore all the \s and \n, I added them for readability):

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.util.Bytes

scan 'table', {
  COLUMNS => 'F:Q',
  FILTER => "
    (
      (PrefixFilter('0'))
      OR
      (PrefixFilter('1'))
    )
    AND
    (
      SingleColumnValueFilter('F', 'Q', =, 'binary:a')
      OR
      SingleColumnValueFilter('F', 'Q', =, 'binary:b')
    )
  "
}

Now suppose I have two lists of filters in Java:

List<Filter> prefixFilters            = new ArrayList<>();
List<Filter> singleColumnValueFilters = new ArrayList<>();

PrefixFilter one  = new PrefixFilter(Bytes.toBytes("1"));
PrefixFilter zero = new PrefixFilter(Bytes.toBytes("0"));

SingleColumnValueFilter a = new SingleColumnValueFilter(
    Bytes.toBytes("F"),
    Bytes.toBytes("Q"),
    CompareFilter.CompareOp.EQUAL,
    Bytes.toBytes("a") 
);

SingleColumnValueFilter b = new SingleColumnValueFilter(
    Bytes.toBytes("F"),
    Bytes.toBytes("Q"),
    CompareFilter.CompareOp.EQUAL,
    Bytes.toBytes("b") 
);

prefixFilters.add(zero);
prefixFilters.add(one);

singleColumnValueFilters.add(a);
singleColumnValueFilters.add(b);

FilterList prefixFiltersList = new FilterList(FilterList.Operator.MUST_PASS_ONE, prefixFilters);
FilterList singleColumnValueFiltersList = new FilterList(FilterList.Operator.MUST_PASS_ONE, singleColumnValueFilters);

Question: how do I combine them with an AND operator in a single scan.setFilter() call, as I did in the shell?


I was hoping for a special FilterList constructor that would accept a logical operator (AND / OR) and multiple List<Filter> arguments. Since there is none, I'm stuck.

Add this at the end:

FilterList filters = new FilterList(FilterList.Operator.MUST_PASS_ALL);
filters.addFilter(prefixFiltersList);
filters.addFilter(singleColumnValueFiltersList);

scan.setFilter(filters);

This ensures that both FilterLists are evaluated, with MUST_PASS_ALL acting as the AND condition between them.

Why does this work? According to the FilterList JavaDoc:

Since you can use Filter Lists as children of Filter Lists, you can create a hierarchy of filters to be evaluated.
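
To see why nesting gives the desired rows, here is a minimal plain-Java sketch of the same AND-of-ORs hierarchy. It does not use the HBase API: Row, mustPassOne, mustPassAll and matchingKeys are hypothetical stand-ins that only model the boolean logic of MUST_PASS_ONE and MUST_PASS_ALL, applied to the sample table from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class NestedFilterSketch {
    // Hypothetical model of one row: the row key and the value stored in F:Q.
    static final class Row {
        final String key;
        final String value;
        Row(String key, String value) { this.key = key; this.value = value; }
    }

    // Models MUST_PASS_ONE: a row passes if at least one child filter accepts it (OR).
    static Predicate<Row> mustPassOne(List<Predicate<Row>> filters) {
        return row -> filters.stream().anyMatch(f -> f.test(row));
    }

    // Models MUST_PASS_ALL: a row passes only if every child filter accepts it (AND).
    static Predicate<Row> mustPassAll(List<Predicate<Row>> filters) {
        return row -> filters.stream().allMatch(f -> f.test(row));
    }

    // The eight rows shown in the question's scan output.
    static List<Row> sampleTable() {
        return Arrays.asList(
            new Row("000", "a"), new Row("001", "b"), new Row("010", "c"),
            new Row("011", "b"), new Row("100", "a"), new Row("110", "c"),
            new Row("200", "b"), new Row("210", "a"));
    }

    // Builds (prefix 0 OR prefix 1) AND (value a OR value b) and returns the matching row keys.
    static List<String> matchingKeys() {
        List<Predicate<Row>> prefixFilters = Arrays.asList(
            r -> r.key.startsWith("0"),
            r -> r.key.startsWith("1"));
        List<Predicate<Row>> valueFilters = Arrays.asList(
            r -> r.value.equals("a"),
            r -> r.value.equals("b"));

        // Two OR groups become the children of one AND group, like nested FilterLists.
        Predicate<Row> combined = mustPassAll(Arrays.asList(
            mustPassOne(prefixFilters),
            mustPassOne(valueFilters)));

        List<String> keys = new ArrayList<>();
        for (Row row : sampleTable()) {
            if (combined.test(row)) {
                keys.add(row.key);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(matchingKeys()); // prints [000, 001, 011, 100]
    }
}
```

The printed keys match the desired scan result in the question. Rows 010 and 110 fail the value group, and rows 200 and 210 fail the prefix group.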