Snowflake query engine strategy for several WITH queries with filter conditions

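I'm migrating a job from PySpark queries to Snowflake queries, and I'd like to know which of options A and B below is better.

To avoid repeating the same condition in every subquery, I'd prefer option B, provided there is no significant performance difference.

With option B, does the Snowflake query engine optimize the query automatically so that it behaves internally like option A?

Option A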

With A1 AS (select * from a1 where date='2021-10-20'),
A2 AS (select * from a2 where date='2021-10-20'),
A3 AS (select * from a3 where date='2021-10-20'),
A4 AS (select * from a4 where date='2021-10-20'),
A5 AS (select * from a5 where date='2021-10-20')
SELECT *
FROM final_merged_table

and option B

With A1 AS (select * from a1),
A2 AS (select * from a2),
A3 AS (select * from a3),
A4 AS (select * from a4),
A5 AS (select * from a5)
SELECT *
FROM final_merged_table
WHERE date = '2021-10-20'

We can test this. First, let's build a table with one week of dates and 100 million rows:

create or replace table one_week2
as
select '2020-04-01'::date + (7*seq8()/100000000)::int day, random() data, random() data2, random() data3
from table(generator(rowcount => 100000000))

Now we can write two queries against this table:

Option 1:

With A1 AS (select * from one_week2 where day='2020-04-05'),
A2 AS (select * from one_week2 where day='2020-04-05'),
A3 AS (select * from one_week2 where day='2020-04-05'),
A4 AS (select * from one_week2 where day='2020-04-05'),
A5 AS (select * from one_week2 where day='2020-04-05'),
final_merged_table as (
    select * from a1 
    union all select * from a2
    union all select * from a3
    union all select * from a4
    union all select * from a5)

SELECT count(*)
FROM final_merged_table

Option 2:

With A1 AS (select * from one_week2),
A2 AS (select * from one_week2),
A3 AS (select * from one_week2),
A4 AS (select * from one_week2),
A5 AS (select * from one_week2),
final_merged_table as (
    select * from a1 
    union all select * from a2
    union all select * from a3
    union all select * from a4
    union all select * from a5)

SELECT count(*)
FROM final_merged_table
where day='2020-04-05'
;

When we run these queries, the query profiles for both look the same, because the filter has been pushed down into each CTE's table scan:

Option 1 profile

Option 2 profile
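The same comparison can also be made in SQL rather than in the UI. The following is a minimal sketch, assuming the GET_QUERY_OPERATOR_STATS table function is available in your account and that you run it in the same session, immediately after option 1 or option 2. The column aliases are my own. If the filter was pushed down, the TableScan operators should report similar scanned-partition counts for both options.

```sql
-- Run this right after option 1, then again right after option 2,
-- and compare the two result sets.
-- last_query_id() refers to the most recent query in this session.
select operator_id,
       operator_type,
       operator_statistics:pruning:partitions_scanned as partitions_scanned,
       operator_statistics:pruning:partitions_total   as partitions_total
from table(get_query_operator_stats(last_query_id()))
order by operator_id;
```

Operators other than table scans have no pruning statistics, so those two columns will be NULL for them.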

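Summary

You can trust the Snowflake optimizer.

Trust, but verify: sometimes the optimizer can get confused by complex CTEs. And Snowflake engineers keep improving the optimizer, so something that doesn't get optimized today may work better tomorrow.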
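One way to verify without actually running the query is to look at the plan with EXPLAIN. This is a sketch using the one_week2 test table from above. The exact plan text varies by Snowflake version, so treat what you see as indicative rather than guaranteed. If pushdown happened, the filter on day should appear below the UnionAll, next to each TableScan, and partitionsAssigned should be well below partitionsTotal.

```sql
-- Show the plan for option 2 without executing it.
explain using text
with a1 as (select * from one_week2),
a2 as (select * from one_week2),
a3 as (select * from one_week2),
a4 as (select * from one_week2),
a5 as (select * from one_week2),
final_merged_table as (
    select * from a1
    union all select * from a2
    union all select * from a3
    union all select * from a4
    union all select * from a5)
select count(*)
from final_merged_table
where day = '2020-04-05';
```

Running the same EXPLAIN on option 1 and comparing the two outputs gives a quick check that is easy to repeat after the query or the optimizer changes.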