PIG: Filter hive table by previous table result

I need to query a Hive table and then filter a second table using one column from the first table's result.

Example:

A = LOAD 'db.table1' USING org.apache.hive.hcatalog.pig.HCatLoader();

filterA = filter A by (id=='123');

B = LOAD 'db.table2' USING org.apache.hive.hcatalog.pig.HCatLoader();

-- The problem is here: filterA has many rows, and I need to apply the filter for each of those rows.

filterB = filter B by (id==filterA.id);

Data in A:

tabid  id  dept  location
1      1   IS    SJ
2      4   CS    SF
3      5   EC    MD

Data in B:

tabid  id  name  address
1      4   john  123 S AVE
2      5   jane  456 N BLVD
3      9   nick  789 GREAT LAKE DR

Expected Result:

tabid  id  name  address
1      4   john  123 S AVE
2      5   jane  456 N BLVD

As asked in the comments, it sounds like what you are looking for is a join. Sorry if I have misunderstood your question.

A = LOAD 'db.table1' USING org.apache.hive.hcatalog.pig.HCatLoader();
B = LOAD 'db.table2' USING org.apache.hive.hcatalog.pig.HCatLoader();
C = JOIN A by id, B by id;
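
Note that C above carries the columns of both A and B, while the expected result in the question has only B's columns. Below is a minimal sketch of one way to get that shape, assuming the column names from the sample data (tabid, id, name, address). The aliases A_ids, A_keys and D are made up for illustration and are not part of the original post.

```pig
A = LOAD 'db.table1' USING org.apache.hive.hcatalog.pig.HCatLoader();
B = LOAD 'db.table2' USING org.apache.hive.hcatalog.pig.HCatLoader();

-- Keep only the join key from A and de-duplicate it,
-- so that each row of B matches at most once.
A_ids  = FOREACH A GENERATE id;
A_keys = DISTINCT A_ids;

-- The inner join keeps only the rows of B whose id also appears in A.
C = JOIN B BY id, A_keys BY id;

-- After a JOIN, fields are disambiguated with the relation::field syntax.
-- Project B's columns only (column names assumed from the sample data).
D = FOREACH C GENERATE B::tabid AS tabid, B::id AS id, B::name AS name, B::address AS address;

DUMP D;
```

If A needs to be filtered first, as with filterA in the question, the same pattern should work with filterA in place of A when building A_ids.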