连接后根据辅助列中的最大计数生成结果
Generate Result based on max count in secondary column after a join
我有两个 table,它们之间有一个共同的密钥,还有很多其他重要信息;为了简单起见,我将使用组合 A 和组合 B。当满足组合时,table 具有最大记录数的那个应该是我收集信息的来源;在这种情况下说 ID。计数相同时优先级为Table1.
COMMONKEY 列是我 table 中的 combination/join 条件。
(Table 1)
SELECT '123' table1_id,'Comb A' commonkey from dual UNION
SELECT '124' table1_id,'Comb A' commonkey from dual UNION
SELECT '125' table1_id,'Comb A' commonkey from dual UNION
SELECT '126' table1_id,'Comb A' commonkey from dual UNION
SELECT '215' table1_id,'Comb B' commonkey from dual UNION
SELECT '216' table1_id,'Comb B' commonkey from dual UNION
SELECT '559' table1_id,'Random Combination 1' commonkey from dual UNION
SELECT '560' table1_id,'Random Combination 2' commonkey from dual ;
( Table 2 )
SELECT 'abc1' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc2' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc3' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc4' table2_id,'Comb A' commonkey from dual UNION
SELECT 'xyz1' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz2' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz3' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz2' table2_id,'Comb B' commonkey from dual UNION
SELECT '416abc1' table2_id,'Random Combination 91' commonkey from dual UNION
SELECT '416abc2' table2_id,'Random Combination 92' commonkey from dual;
Result Set Expected :
ID COMMONKEY
123 Comb A
124 Comb A
125 Comb A
126 Comb A
xyz1 Comb B
xyz2 Comb B
xyz3 Comb B
559 Random Combination 1
560 Random Combination 1
416abc1 Random Combination 91
416abc2 Random Combination 92
Updated Image :
(该图显示了 excel 中跟踪数据的屏幕截图;要求和策略已用颜色映射以使其易于理解)
我需要使用 SQL 生成结果集,如下所示:
当table1.commonkey = table2.commonkey命中时,我需要-
- 如果table1有10个ID,table2有5个ID -> 从table1中选择10个ID。
- 如果table1有15个ID,table2有30个ID -> 从table2中选择30个ID。
- 如果table1有4个ID,table2有4个ID -> 从table1中选择4个ID。
(当相等时,选择 table1 个 ID)
- 当没有与公用键匹配时,防止交叉连接并将行集线性添加到结果 table。
编辑:我最初走的路线是
a left join b where b.key IS null ;
a full outer join b where b.key IS NULL or a.key is NULL ;
使用 A-B 或 B-A 结果集实现解决方法,但这两种方法都是错误的。收集 Delta 集或 Exclusion 集并不顺利。
这是一种选择;查看代码中的评论
SQL> with
2 -- sample data
3 a (id, ckey) as
4 (select '123', 'ca' from dual union all
5 select '124', 'ca' from dual union all
6 select '125', 'ca' from dual union all
7 select '126', 'ca' from dual union all
8 select '215', 'cb' from dual union all
9 select '216', 'cb' from dual union all
10 select '551', 'r1' from dual union all
11 select '552', 'r2' from dual
12 ),
13 b (id, ckey) as
14 (select 'abc1', 'ca' from dual union all
15 select 'abc2', 'ca' from dual union all
16 select 'abc3', 'ca' from dual union all
17 select 'abc4', 'ca' from dual union all
18 select 'xyz1', 'cb' from dual union all
19 select 'xyz2', 'cb' from dual union all
20 select 'xyz3', 'cb' from dual union all
21 select '9991', 'r3' from dual union all
22 select '9992', 'r4' from dual
23 ),
24 -- count rows per each CKEY (common key)
25 tempa as
26 (select id, ckey, count(*) over (partition by ckey) cnt
27 from a
28 ),
29 tempb as
30 (select id, ckey, count(*) over (partition by ckey) cnt
31 from b
32 )
33 -- final query
34 select distinct
35 case when a.cnt >= b.cnt then a.id
36 else b.id
37 end id,
38 a.ckey
39 from tempa a join tempb b on b.ckey = a.ckey
40 union all
41 select ckey, id from a
42 where not exists (select null from b where a.ckey = b.ckey)
43 union all
44 select ckey, id from b
45 where not exists (select null from a where a.ckey = b.ckey)
46 order by 1, 2;
这导致
ID CKEY
---- -----
r1 551
r2 552
r3 9991
r4 9992
xyz1 cb
xyz2 cb
xyz3 cb
123 ca
124 ca
125 ca
126 ca
11 rows selected.
SQL>
我有两个 table,它们之间有一个共同的密钥,还有很多其他重要信息;为了简单起见,我将使用组合 A 和组合 B。当满足组合时,table 具有最大记录数的那个应该是我收集信息的来源;在这种情况下说 ID。计数相同时优先级为Table1.
COMMONKEY 列是我 table 中的 combination/join 条件。
(Table 1)
SELECT '123' table1_id,'Comb A' commonkey from dual UNION
SELECT '124' table1_id,'Comb A' commonkey from dual UNION
SELECT '125' table1_id,'Comb A' commonkey from dual UNION
SELECT '126' table1_id,'Comb A' commonkey from dual UNION
SELECT '215' table1_id,'Comb B' commonkey from dual UNION
SELECT '216' table1_id,'Comb B' commonkey from dual UNION
SELECT '559' table1_id,'Random Combination 1' commonkey from dual UNION
SELECT '560' table1_id,'Random Combination 2' commonkey from dual ;
( Table 2 )
SELECT 'abc1' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc2' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc3' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc4' table2_id,'Comb A' commonkey from dual UNION
SELECT 'xyz1' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz2' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz3' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz2' table2_id,'Comb B' commonkey from dual UNION
SELECT '416abc1' table2_id,'Random Combination 91' commonkey from dual UNION
SELECT '416abc2' table2_id,'Random Combination 92' commonkey from dual;
Result Set Expected :
ID COMMONKEY
123 Comb A
124 Comb A
125 Comb A
126 Comb A
xyz1 Comb B
xyz2 Comb B
xyz3 Comb B
559 Random Combination 1
560 Random Combination 1
416abc1 Random Combination 91
416abc2 Random Combination 92
Updated Image :
(该图显示了 excel 中跟踪数据的屏幕截图;要求和策略已用颜色映射以使其易于理解)
我需要使用 SQL 生成结果集,如下所示:
当table1.commonkey = table2.commonkey命中时,我需要-
- 如果table1有10个ID,table2有5个ID -> 从table1中选择10个ID。
- 如果table1有15个ID,table2有30个ID -> 从table2中选择30个ID。
- 如果table1有4个ID,table2有4个ID -> 从table1中选择4个ID。 (当相等时,选择 table1 个 ID)
- 当没有与公用键匹配时,防止交叉连接并将行集线性添加到结果 table。
编辑:我最初走的路线是
a left join b where b.key IS null ;
a full outer join b where b.key IS NULL or a.key is NULL ;
使用 A-B 或 B-A 结果集实现解决方法,但这两种方法都是错误的。收集 Delta 集或 Exclusion 集并不顺利。
这是一种选择;查看代码中的评论
SQL> with
2 -- sample data
3 a (id, ckey) as
4 (select '123', 'ca' from dual union all
5 select '124', 'ca' from dual union all
6 select '125', 'ca' from dual union all
7 select '126', 'ca' from dual union all
8 select '215', 'cb' from dual union all
9 select '216', 'cb' from dual union all
10 select '551', 'r1' from dual union all
11 select '552', 'r2' from dual
12 ),
13 b (id, ckey) as
14 (select 'abc1', 'ca' from dual union all
15 select 'abc2', 'ca' from dual union all
16 select 'abc3', 'ca' from dual union all
17 select 'abc4', 'ca' from dual union all
18 select 'xyz1', 'cb' from dual union all
19 select 'xyz2', 'cb' from dual union all
20 select 'xyz3', 'cb' from dual union all
21 select '9991', 'r3' from dual union all
22 select '9992', 'r4' from dual
23 ),
24 -- count rows per each CKEY (common key)
25 tempa as
26 (select id, ckey, count(*) over (partition by ckey) cnt
27 from a
28 ),
29 tempb as
30 (select id, ckey, count(*) over (partition by ckey) cnt
31 from b
32 )
33 -- final query
34 select distinct
35 case when a.cnt >= b.cnt then a.id
36 else b.id
37 end id,
38 a.ckey
39 from tempa a join tempb b on b.ckey = a.ckey
40 union all
41 select ckey, id from a
42 where not exists (select null from b where a.ckey = b.ckey)
43 union all
44 select ckey, id from b
45 where not exists (select null from a where a.ckey = b.ckey)
46 order by 1, 2;
这导致
ID CKEY
---- -----
r1 551
r2 552
r3 9991
r4 9992
xyz1 cb
xyz2 cb
xyz3 cb
123 ca
124 ca
125 ca
126 ca
11 rows selected.
SQL>