连接后根据辅助列中的最大计数生成结果

Generate Result based on max count in secondary column after a join

我有两个 table,它们之间有一个共同的密钥,还有很多其他重要信息;为了简单起见,我将使用组合 A 和组合 B。当满足组合时,table 具有最大记录数的那个应该是我收集信息的来源;在这种情况下说 ID。计数相同时优先级为Table1.

COMMONKEY 列是我 table 中的 combination/join 条件。

    (Table 1)

  SELECT '123' table1_id,'Comb A' commonkey from dual UNION
  SELECT '124' table1_id,'Comb A' commonkey from dual UNION
  SELECT '125' table1_id,'Comb A' commonkey from dual UNION
  SELECT '126' table1_id,'Comb A' commonkey from dual UNION
  SELECT '215' table1_id,'Comb B' commonkey from dual UNION
  SELECT '216' table1_id,'Comb B' commonkey from dual UNION
  SELECT '559' table1_id,'Random Combination 1' commonkey from dual UNION
  SELECT '560' table1_id,'Random Combination 2' commonkey from dual ;   
                                 
    ( Table 2 )     
        
  SELECT 'abc1' table2_id,'Comb A' commonkey from dual  UNION
  SELECT 'abc2' table2_id,'Comb A' commonkey from dual  UNION
  SELECT 'abc3' table2_id,'Comb A' commonkey from dual  UNION
  SELECT 'abc4' table2_id,'Comb A' commonkey from dual  UNION
  SELECT 'xyz1' table2_id,'Comb B' commonkey from dual  UNION
  SELECT 'xyz2' table2_id,'Comb B' commonkey from dual  UNION
  SELECT 'xyz3' table2_id,'Comb B' commonkey from dual  UNION
  SELECT 'xyz2' table2_id,'Comb B' commonkey from dual  UNION 
  SELECT '416abc1' table2_id,'Random Combination 91' commonkey from dual UNION
  SELECT '416abc2' table2_id,'Random Combination 92' commonkey from dual;
  
    
    

Result Set Expected :

ID        COMMONKEY         
123       Comb A            
124       Comb A            
125       Comb A            
126       Comb A            
xyz1      Comb B            
xyz2      Comb B            
xyz3      Comb B            
559       Random Combination 1          
560       Random Combination 1          
416abc1   Random Combination 91         
416abc2   Random Combination 92 

Updated Image :

(该图显示了 excel 中跟踪数据的屏幕截图;要求和策略已用颜色映射以使其易于理解)

我需要使用 SQL 生成结果集,如下所示:

当table1.commonkey = table2.commonkey命中时,我需要-

编辑:我最初走的路线是

  a left join b where b.key IS null ;
  a full outer join b where b.key IS NULL or a.key is NULL ;

使用 A-B 或 B-A 结果集实现解决方法,但这两种方法都是错误的。收集 Delta 集或 Exclusion 集并不顺利。

这是一种选择;查看代码中的评论

SQL> with
  2  -- sample data
  3  a (id, ckey) as
  4    (select '123', 'ca' from dual union all
  5     select '124', 'ca' from dual union all
  6     select '125', 'ca' from dual union all
  7     select '126', 'ca' from dual union all
  8     select '215', 'cb' from dual union all
  9     select '216', 'cb' from dual union all
 10     select '551', 'r1' from dual union all
 11     select '552', 'r2' from dual
 12    ),
 13  b (id, ckey) as
 14    (select 'abc1', 'ca' from dual union all
 15     select 'abc2', 'ca' from dual union all
 16     select 'abc3', 'ca' from dual union all
 17     select 'abc4', 'ca' from dual union all
 18     select 'xyz1', 'cb' from dual union all
 19     select 'xyz2', 'cb' from dual union all
 20     select 'xyz3', 'cb' from dual union all
 21     select '9991', 'r3' from dual union all
 22     select '9992', 'r4' from dual
 23    ),

 24  -- count rows per each CKEY (common key)
 25  tempa as
 26    (select id, ckey, count(*) over (partition by ckey) cnt
 27     from a
 28    ),
 29  tempb as
 30    (select id, ckey, count(*) over (partition by ckey) cnt
 31     from b
 32    )
 33  -- final query
 34  select distinct
 35    case when a.cnt >= b.cnt then a.id
 36         else b.id
 37    end id,
 38    a.ckey
 39  from tempa a join tempb b on b.ckey = a.ckey
 40  union all
 41  select ckey, id from a
 42    where not exists (select null from b where a.ckey = b.ckey)
 43  union all
 44  select ckey, id from b
 45    where not exists (select null from a where a.ckey = b.ckey)
 46  order by 1, 2;

这导致

ID   CKEY
---- -----
r1   551
r2   552
r3   9991
r4   9992
xyz1 cb
xyz2 cb
xyz3 cb
123  ca
124  ca
125  ca
126  ca

11 rows selected.

SQL>