Returning any element of one array column that isn't contained in another (SQL)

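I have two array columns in a table, and I want to create a third column containing any elements of the first column that are not in the second. For example, the first row of the two columns looks like this: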

Col1: ['apple', 'banana', 'orange', 'pear']
Col2: ['apple', 'banana']

It would return:

Col3: ['orange', 'pear']

Essentially, this is the opposite of the array_intersect function. I've seen array_diff in PHP, so I'm wondering whether there is an equivalent function in SQL?

If you have a primary key, then I think this does what you want:

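-- Explode each array in its own subquery, then left join on (pk, element).
-- Elements of col1 with no match in col2 have c2.el null and are collected.
select c1.pk, collect_set(case when c2.el is null then c1.el end) as col3
from (select t.pk, e.el
      from t lateral view explode(t.col1) e as el
     ) c1 left join
     (select t2.pk, e2.el
      from t t2 lateral view explode(t2.col2) e2 as el
     ) c2
     on c1.pk = c2.pk and
        c1.el = c2.el
group by c1.pk;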
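
To try the join approach without a real table, inline test data can stand in for t. This is only a sketch: the pk values 1 and 2 and the second sample row are made up for illustration, and each lateral view is written as its own subquery.

```sql
-- Hypothetical test data; replace the CTE with your own table.
with t as (
  select stack(2,
    1, array('apple','banana','orange','pear'), array('apple','banana'),
    2, array('kiwi','plum'),                    array('plum')
  ) as (pk, col1, col2)
)
select c1.pk,
       -- collect_set skips nulls, so only unmatched col1 elements remain
       collect_set(case when c2.el is null then c1.el end) as col3
from (select t.pk, e.el
      from t lateral view explode(t.col1) e as el
     ) c1
left join
     (select t.pk, e.el
      from t lateral view explode(t.col2) e as el
     ) c2
  on c1.pk = c2.pk and c1.el = c2.el
group by c1.pk;
```

Note that a row whose col1 is empty produces no exploded rows and therefore drops out of the result; lateral view outer would be needed to keep it.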

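Explode col1, use array_contains inside a case expression to null out the elements that also appear in col2, and reassemble the array with collect_set or collect_list. collect_set removes duplicates, while collect_list keeps them.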

Demo:

with your_data as (--Test data. Use your table instead of this
select stack(1,
array('apple','banana','orange','pear'),
array('apple','banana')
) as (col1, col2)
)

select col1, col2, 
       collect_set(case when array_contains(t.col2, e.col1_elem) then null else e.col1_elem end) as col3
  from your_data t
       lateral view explode(t.col1) e as col1_elem
group by col1,  col2

Result:

col1                                 col2                col3
["apple","banana","orange","pear"]  ["apple","banana"]  ["orange","pear"]
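
As for a direct counterpart to PHP's array_diff: if the engine happens to be Spark SQL 2.4 or later rather than classic Hive (an assumption, since the question does not say), the built-in array_except(array1, array2) returns the elements of the first array that are not in the second, without duplicates. A minimal sketch using the same sample data, with your_data as a placeholder name:

```sql
-- Spark SQL (2.4+) only; not available in older Hive versions.
select col1,
       col2,
       array_except(col1, col2) as col3  -- elements of col1 absent from col2
from values
  (array('apple','banana','orange','pear'), array('apple','banana'))
  as your_data(col1, col2);
```

Because array_except also removes duplicates, it behaves like the collect_set variant above rather than the collect_list one.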