Snowflake joins performance improvement
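I need to create a view on top of a table that has more than 1,300 columns. New data is loaded into the table every quarter (millions of rows). When creating the view, I need to join other tables to the base table. I also need to add a latest-row indicator to the view.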
CREATE OR REPLACE SECURE VIEW VIEW_NAME AS
SELECT lkp_tbl.col1,base_tbl.col1,base_tbl.col2,base_tbl.col3,........,
base_tbl.col1334, 1 as Is_Latest_Quarter
FROM base_tbl full outer JOIN lkp_tbl
on base_tbl.CUST_ID = lkp_tbl.CUST_ID
where snapshot_dt=(select max(snapshot_dt) from base_tbl)
union all
SELECT lkp_tbl.col1,base_tbl.col1,base_tbl.col2,base_tbl.col3,........,
base_tbl.col1334,0 as Is_Latest_Quarter
FROM base_tbl full outer JOIN lkp_tbl
on base_tbl.CUST_ID = lkp_tbl.CUST_ID
where snapshot_dt!=(select max(snapshot_dt) from base_tbl);
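After creating this view, query performance is too slow, even when we query only 100 rows. Is there a more efficient way to create the view? If not, how can I improve performance?
Use a single SELECT statement and compute Is_Latest_Quarter with a CASE expression.
Updated with (almost) the actual SQL: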
CREATE OR REPLACE SECURE VIEW VIEW_NAME AS
SELECT {list of columns you want to include}
,CASE WHEN snapshot_dt=(select max(snapshot_dt) from base_tbl) THEN 1
ELSE 0 END as Is_Latest_Quarter
FROM base_tbl
full outer JOIN lkp_tbl on base_tbl.CUST_ID = lkp_tbl.CUST_ID
Alternatively, if Snowflake doesn't like that inline subquery, you can use a CTE like this:
CREATE OR REPLACE SECURE VIEW VIEW_NAME AS
WITH MAX_DATE AS (SELECT MAX(snapshot_dt) AS max_snapshot_dt FROM base_tbl)
SELECT {list of columns you want to include}
,CASE WHEN max_date.max_snapshot_dt is not null THEN 1
ELSE 0 END as Is_Latest_Quarter
FROM base_tbl
full outer JOIN lkp_tbl on base_tbl.CUST_ID = lkp_tbl.CUST_ID
LEFT OUTER JOIN MAX_DATE ON base_tbl.snapshot_dt = max_date.max_snapshot_dt
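A third variant, offered only as a sketch, computes the flag with a window aggregate instead of a subquery or a CTE join. The tables base_tbl_demo and lkp_tbl_demo and their rows are hypothetical stand-ins for base_tbl and lkp_tbl, trimmed to a few columns so the statement can be run on its own. Whether this is faster than the CASE-plus-subquery form is something to check in the query profile, not something this example shows.

```sql
-- Hypothetical miniature tables standing in for base_tbl and lkp_tbl.
CREATE OR REPLACE TEMPORARY TABLE base_tbl_demo (
    cust_id     INT,
    snapshot_dt DATE,
    col1        VARCHAR
);
CREATE OR REPLACE TEMPORARY TABLE lkp_tbl_demo (
    cust_id INT,
    col1    VARCHAR
);

INSERT INTO base_tbl_demo VALUES
    (1, '2023-03-31', 'a'),
    (1, '2023-06-30', 'b'),
    (2, '2023-06-30', 'c');
INSERT INTO lkp_tbl_demo VALUES
    (1, 'x'),
    (3, 'z');

-- One pass over the join: MAX(...) OVER () is the latest snapshot_dt
-- across all joined rows (NULLs from lookup-only rows are ignored by MAX).
SELECT l.col1 AS lkp_col1,
       b.cust_id,
       b.snapshot_dt,
       b.col1 AS base_col1,
       CASE WHEN b.snapshot_dt = MAX(b.snapshot_dt) OVER () THEN 1
            ELSE 0 END AS Is_Latest_Quarter
FROM base_tbl_demo b
FULL OUTER JOIN lkp_tbl_demo l
    ON b.cust_id = l.cust_id;
```

One behavioral difference from the original UNION ALL view is worth knowing about. Rows that exist only in the lookup table have a NULL snapshot_dt, so neither the `=` nor the `!=` filter in the original keeps them, and they are silently dropped. Every single-SELECT version, including this one, keeps those rows with Is_Latest_Quarter = 0. If the original behavior is what you want, a LEFT JOIN from the base table reproduces it.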