How to get unmatched elements from two Hive arrays

I am looking for a Hive function or query that returns the unmatched elements from two arrays in Hive. Suppose the arrays are

A = ["Hello", "earth"]
B = ["Hello", "mars"]
Expected output is ["earth", "mars"] or ["mars", "earth"]

Another example

A = ["Hello", "world"]
B = ["Hello", "world", "!"]
Expected output is ["!"]

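Explode each array in its own subquery, FULL JOIN the two subqueries on the element value, keep only the rows that have no match on one side, and then aggregate those rows with collect_set to get an array. Note that collect_set also removes duplicate values from the result.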

Demo:

-- Sample data: a single row holding both arrays
with your_data as (
select array("Hello", "earth") A,
       array("Hello", "mars") B
)

select collect_set(coalesce(A.element_A, B.element_B)) as result
from
-- one row per element of A
(select element_A
 from your_data d lateral view explode(A) e as element_A
) A

FULL JOIN

-- one row per element of B
(select element_B
 from your_data d lateral view explode(B) e as element_B
) B
ON A.element_A = B.element_B

-- keep only the elements that have no match on the other side
WHERE A.element_A is NULL OR B.element_B is NULL

Result:

["earth","mars"]
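
The demo handles a single row. If the table holds many rows, the same idea can probably be extended by carrying a row key through both subqueries and joining on it as well. The following is only a sketch under that assumption: the `id` column and the aliases `ea` and `eb` are hypothetical names added for illustration and do not appear in the question.

```sql
-- Sketch: per-row unmatched elements, assuming each row has a unique key (hypothetical column "id")
with your_data as (
  select 1 as id, array("Hello", "earth") A, array("Hello", "mars") B
  union all
  select 2 as id, array("Hello", "world") A, array("Hello", "world", "!") B
)

select coalesce(ea.id, eb.id) as id,
       collect_set(coalesce(ea.element_A, eb.element_B)) as result
from
  -- one row per (id, element of A)
  (select id, element_A
   from your_data d lateral view explode(A) e as element_A
  ) ea
FULL JOIN
  -- one row per (id, element of B)
  (select id, element_B
   from your_data d lateral view explode(B) e as element_B
  ) eb
ON ea.id = eb.id AND ea.element_A = eb.element_B
-- keep only the elements that have no match within the same row
WHERE ea.element_A is NULL OR eb.element_B is NULL
GROUP BY coalesce(ea.id, eb.id)
```

With this sample data, id 1 should yield "earth" and "mars" (collect_set does not guarantee their order) and id 2 should yield ["!"], matching the two examples in the question. A row whose arrays contain exactly the same elements produces no output row at all, because the WHERE clause filters out every joined record for it.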