What is the equivalent of SQL NOT IN in Cascading Pipes?

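I have two files that share a common field, and I need to pick rows based on whether that field's value appears in the second file.

How do I add a WHERE condition here?

Or is there another pipe that I should be using and am not?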

File 1:

tcno,date,amt
1234,3/10/2016,1000
1234,3/11/2016,400
23456,2/10/2016,1500

File 2:

cno,fname,lname,city,phone,mail
1234,first,last,city,1234556,123@123.com

Sample code:

Pipe pipe1 = new Pipe("custPipe");
Pipe pipe2 = new Pipe("tscnPipe");
Fields cJoinField = new Fields("cno");
Fields tJoinField = new Fields("tcno");
Pipe pipe = new HashJoin(pipe1, cJoinField, pipe2, tJoinField,  new OuterJoin());
//HOW TO ADD WHERE CONDITION i.e. CNO IS NULL FROM SECOND FILE
Fields outFields = new Fields("tcno","tdate", "tamt");

I expect the output to be the last row of the first file: [23456,2/10/2016,1500]

Going by the comment in your code:

//HOW TO ADD WHERE CONDITION i.e. CNO IS NULL FROM SECOND FILE

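Try FilterNotNull rather than FilterNull. FilterNull discards every tuple whose argument value is null, which would leave only the matched rows. FilterNotNull does the opposite: it discards tuples whose argument value is not null, so only the rows where cno is null (transactions with no matching customer) remain.

Add the following lines to your code after the HashJoin step:

FilterNotNull filterNotNull = new FilterNotNull();
pipe = new Each( pipe, cJoinField, filterNotNull );

Something like:

import cascading.operation.filter.FilterNotNull;
import cascading.pipe.Each;
import cascading.pipe.HashJoin;
import cascading.pipe.Pipe;
import cascading.pipe.joiner.OuterJoin;
import cascading.tuple.Fields;

Pipe pipe1 = new Pipe("custPipe");
Pipe pipe2 = new Pipe("tscnPipe");
Fields cJoinField = new Fields("cno");
Fields tJoinField = new Fields("tcno");
Pipe pipe = new HashJoin(pipe1, cJoinField, pipe2, tJoinField,  new OuterJoin());

// Keep only the tuples where cno is null, i.e. transactions with no matching customer
FilterNotNull filterNotNull = new FilterNotNull();
pipe = new Each( pipe, cJoinField, filterNotNull );

Fields outFields = new Fields("tcno","tdate", "tamt");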
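
For reference, the result the question asks for (rows of file 1 whose tcno does not appear as a cno in file 2) can be sketched in plain Java without Cascading. This is only an illustration of the NOT IN semantics on the sample data from the question; the class and method names (NotInSketch, notIn) are made up for this example and are not part of the Cascading API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NotInSketch {

    // Returns the transaction rows whose first column (tcno) does not
    // appear as the first column (cno) of any customer row.
    static List<String[]> notIn(List<String[]> transactions, List<String[]> customers) {
        Set<String> customerNumbers = new HashSet<>();
        for (String[] customer : customers) {
            customerNumbers.add(customer[0]);
        }

        List<String[]> unmatched = new ArrayList<>();
        for (String[] txn : transactions) {
            // Equivalent to "cno IS NULL" after an outer join on cno = tcno
            if (!customerNumbers.contains(txn[0])) {
                unmatched.add(txn);
            }
        }
        return unmatched;
    }

    public static void main(String[] args) {
        // File 1 (tcno,date,amt), header row omitted
        List<String[]> transactions = Arrays.asList(
            new String[] {"1234", "3/10/2016", "1000"},
            new String[] {"1234", "3/11/2016", "400"},
            new String[] {"23456", "2/10/2016", "1500"});

        // File 2 (cno,fname,lname,city,phone,mail), header row omitted
        List<String[]> customers = Collections.singletonList(
            new String[] {"1234", "first", "last", "city", "1234556", "123@123.com"});

        for (String[] row : notIn(transactions, customers)) {
            System.out.println(String.join(",", row)); // prints 23456,2/10/2016,1500
        }
    }
}
```

Comparing this output with what the Cascading flow writes is a quick way to check that the null filter keeps the unmatched rows rather than the matched ones.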