Unioning complete and incomplete dataframes in Snowflake and SQL

I have a df that looks like this:

DF3

ID    field1   field2  field3
001   banana        1       y
001    apple        1       y
004   orange       21       n
005   orange       32       y

This table is DF3. It is the future state of DF2, which looks like this:

DF2

ID    field1   field2
001   banana        1
001    apple        1
003    apple        1
004   orange       21
005   orange       32

And DF2 in turn follows DF1:

DF1

ID    field1
001   banana
001    apple
002   banana
003    apple
004   orange
005   orange

Think of DF3 as holding the complete records. I want the incomplete records from DF1 and DF2 in a single table together with DF3.

I'd like my final result to look like this:

ID    field1   field2  field3
001   banana        1       y
001    apple        1       y
002   banana     NULL    NULL
003    apple        1    NULL
004   orange       21       n
005   orange       32       y

I think this can be done with some combination of UNIONs, but I'm struggling to work out how to do it in Snowflake.

This looks like a left join:

-- df1 holds every (id, field1) pair, so it drives both joins
select df1.id, df1.field1, df2.field2, df3.field3
from df1 left join
     df2
     on df1.id = df2.id and
        df1.field1 = df2.field1 left join
     df3
     on df2.id = df3.id and
        df2.field1 = df3.field1 and
        df2.field2 = df3.field2;

Hmm, I'd go with a full outer join in case the IDs aren't completely in sync.

-- sample data from the question
with df3 as (
    select '001' id, 'banana' field1, 1 field2, 'y' field3
    union all select '001' id, 'apple' field1, 1 field2, 'y' field3
    union all select '004' id, 'orange' field1, 21 field2, 'n' field3
    union all select '005' id, 'orange' field1, 32 field2, 'y' field3
), df2 as (
    select '001' id, 'banana' field1, 1 field2
    union all select '001' id, 'apple' field1, 1 field2
    union all select '003' id, 'apple' field1, 1 field2
    union all select '004' id, 'orange' field1, 21 field2
    union all select '005' id, 'orange' field1, 32 field2
), df1 as (
    select '001' id, 'banana' field1
    union all select '001' id, 'apple' field1
    union all select '002' id, 'banana' field1
    union all select '003' id, 'apple' field1
    union all select '004' id, 'orange' field1
    union all select '005' id, 'orange' field1
)
select
    coalesce(df1.id, df2.id, df3.id) id,
    coalesce(df1.field1, df2.field1, df3.field1) field1,
    coalesce(df2.field2, df3.field2) field2,
    df3.field3
from df1
-- id repeats (001 appears twice), so match on field1 as well
full outer join df2
    on df1.id = df2.id
   and df1.field1 = df2.field1
-- coalesce the keys so rows missing from df1 can still match df3
full outer join df3
    on coalesce(df1.id, df2.id) = df3.id
   and coalesce(df1.field1, df2.field1) = df3.field1
-- collapses any exact duplicate rows
group by 1, 2, 3, 4;
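
The UNION idea from the question can also be made to work. Below is a minimal sketch, not a tested Snowflake solution: it stacks the three tables with UNION ALL, pads the missing columns with NULL, and collapses each (id, field1) pair with MAX, which ignores NULLs. To keep it runnable anywhere, it loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite standing in for Snowflake is an assumption, as are the names UNION_QUERY, combine, and stacked, which exist only for this illustration. The query uses plain standard SQL, so it should carry over to Snowflake, but that has not been verified here.

```python
import sqlite3

# Sample rows copied from the question's DF1, DF2 and DF3.
DF1 = [("001", "banana"), ("001", "apple"), ("002", "banana"),
       ("003", "apple"), ("004", "orange"), ("005", "orange")]
DF2 = [("001", "banana", 1), ("001", "apple", 1), ("003", "apple", 1),
       ("004", "orange", 21), ("005", "orange", 32)]
DF3 = [("001", "banana", 1, "y"), ("001", "apple", 1, "y"),
       ("004", "orange", 21, "n"), ("005", "orange", 32, "y")]

# Stack all three tables, padding missing columns with NULL, then collapse
# each (id, field1) pair. MAX ignores NULLs, so a known value wins over a gap.
UNION_QUERY = """
select id, field1, max(field2) as field2, max(field3) as field3
from (
    select id, field1, field2, field3 from df3
    union all
    select id, field1, field2, null from df2
    union all
    select id, field1, null, null from df1
) as stacked
group by id, field1
order by id, field1
"""


def combine():
    """Load the sample data into in-memory SQLite (a stand-in for Snowflake)
    and return the rows produced by UNION_QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table df1 (id text, field1 text)")
    conn.execute("create table df2 (id text, field1 text, field2 integer)")
    conn.execute(
        "create table df3 (id text, field1 text, field2 integer, field3 text)"
    )
    conn.executemany("insert into df1 values (?, ?)", DF1)
    conn.executemany("insert into df2 values (?, ?, ?)", DF2)
    conn.executemany("insert into df3 values (?, ?, ?, ?)", DF3)
    rows = conn.execute(UNION_QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in combine():
        print(row)
```

One caveat with this approach: if two tables ever disagree on field2 or field3 for the same (id, field1) pair, MAX silently keeps the larger value, whereas the join-based answers above take the value from a specific table.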