PIG:按之前的 table 结果过滤配置单元 table
PIG: Filter hive table by previous table result
我需要查询一个 HIVE table 并使用前一列的一列过滤另一个 table。
示例:
A = LOAD 'db.table1' USING org.apache.hive.hcatalog.pig.HCatLoader();
filterA = filter A by (id=='123');
B = LOAD 'db.table2' USING org.apache.hive.hcatalog.pig.HCatLoader();
//the problem is here. filterA has many rows. I need to apply filter for each of the row.
filterB = filter B by (id==filterA.id);
Data in A:
tabid id dept location
1 1 IS SJ
2 4 CS SF
3 5 EC MD
Data in B:
tabid id name address
1 4 john 123 S AVE
2 5 jane 456 N BLVD
3 9 nick 789 GREAT LAKE DR
Expected Result:
tabid id name address
1 4 john 123 S AVE
2 5 jane 456 N BLVD
正如评论中所问,听起来您正在寻找的是连接。对不起,如果我误解了你的问题。
A = LOAD 'db.table1' USING org.apache.hive.hcatalog.pig.HCatLoader();
B = LOAD 'db.table2' USING org.apache.hive.hcatalog.pig.HCatLoader();
C = JOIN A by id, B by id;
我需要查询一个 HIVE table 并使用前一列的一列过滤另一个 table。
示例:
A = LOAD 'db.table1' USING org.apache.hive.hcatalog.pig.HCatLoader();
filterA = filter A by (id=='123');
B = LOAD 'db.table2' USING org.apache.hive.hcatalog.pig.HCatLoader();
//the problem is here. filterA has many rows. I need to apply filter for each of the row.
filterB = filter B by (id==filterA.id);
Data in A:
tabid id dept location
1 1 IS SJ
2 4 CS SF
3 5 EC MD
Data in B:
tabid id name address
1 4 john 123 S AVE
2 5 jane 456 N BLVD
3 9 nick 789 GREAT LAKE DR
Expected Result:
tabid id name address
1 4 john 123 S AVE
2 5 jane 456 N BLVD
正如评论中所问,听起来您正在寻找的是连接。对不起,如果我误解了你的问题。
A = LOAD 'db.table1' USING org.apache.hive.hcatalog.pig.HCatLoader();
B = LOAD 'db.table2' USING org.apache.hive.hcatalog.pig.HCatLoader();
C = JOIN A by id, B by id;