What is the best way to assert that a set of columns could form a primary key in Snowflake?
Notoriously, primary key constraints are not enforced in Snowflake SQL:
-- Generating a table with 4 rows that contain duplicates and NULLs:
CREATE OR REPLACE TEMP TABLE PRIMARY_KEY_TEST AS
SELECT
*
FROM (
SELECT 1 AS PK, 'TEST_TEXT' AS TEXT
UNION ALL SELECT 1 AS PK, 'TEST_TEXT' AS TEXT
UNION ALL SELECT NULL AS PK, NULL AS TEXT
UNION ALL SELECT NULL AS PK, NULL AS TEXT
)
;
SELECT *
FROM PRIMARY_KEY_TEST
;
PK | TEXT |
---|---|
1 | TEST_TEXT |
1 | TEST_TEXT |
NULL | NULL |
NULL | NULL |
-- These constraints will NOT throw any errors in Snowflake
ALTER TABLE PRIMARY_KEY_TEST ADD PRIMARY KEY (PK);
ALTER TABLE PRIMARY_KEY_TEST ADD UNIQUE (TEXT);
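However, knowing that the values of a set of columns are unique for every row and never NULL is an essential check when updating a set of data.
So I'm looking for an easy to write and read (ideally 1-2 lines) piece of code (probably based on some Snowflake function) that throws an error if a set of columns no longer forms a viable primary key in Snowflake SQL.
Any suggestions?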
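You can enforce NOT NULL in Snowflake by adding a NOT NULL constraint on the columns that must not be null.
The primary key constraint is informational only; it is not enforced when you insert data into the table. To keep a primary key valid, you have to either delete the existing data or check whether the data already exists before inserting, and update it if it does.
Depending on your operation, you can use the following:
- MERGE (insert and update).
- Use DISTINCT to check whether the row exists, then either update it or delete the old row and insert the new one.
- Use the ROW_NUMBER analytic function to identify duplicates.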
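To make the ROW_NUMBER suggestion concrete, here is a minimal sketch. It is not Snowflake code: it runs the query against an in-memory SQLite database from Python purely so the example is self-contained, and the helper names `make_demo_table` and `find_duplicates` are made up for illustration. The inner SELECT is plain window-function SQL; `rowid` is a SQLite-specific tiebreaker, so on Snowflake you would order by some real column instead.

```python
import sqlite3


def make_demo_table():
    """Build the 4-row demo table from the question in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE PRIMARY_KEY_TEST (PK INTEGER, "TEXT" TEXT)')
    conn.executemany(
        "INSERT INTO PRIMARY_KEY_TEST VALUES (?, ?)",
        [(1, "TEST_TEXT"), (1, "TEST_TEXT"), (None, None), (None, None)],
    )
    return conn


def find_duplicates(conn):
    """Return every row that is the 2nd, 3rd, ... occurrence of its PK value."""
    sql = """
        SELECT PK, TEXT, rn
        FROM (
            SELECT PK, TEXT,
                   ROW_NUMBER() OVER (PARTITION BY PK ORDER BY rowid) AS rn
            FROM PRIMARY_KEY_TEST
        )
        WHERE rn > 1
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in find_duplicates(make_demo_table()):
        print(row)
```

Rows with `rn > 1` are the surplus copies, so the same subquery can also drive a cleanup step. Note that NULL keys land in one partition together, so repeated NULLs show up as duplicates too.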
So I'm looking for an easy to write and read (ideally 1-2 lines) piece of code (probably based on some Snowflake function) that throws an error if a set of columns no longer forms a viable primary key in Snowflake SQL
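With QUALIFY and a windowed COUNT it is easy to write such a test query. The pattern is to put the list of primary key columns into the PARTITION BY clause and search for non-unique values; an additional NULL check can be added as well. If the column list is a valid primary key candidate, the query returns no rows; if there are rows that violate the rules, they are returned: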
-- checking if PK is applicable
SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY PK) > 1
OR PK IS NULL;
-- checking if TEXT column is applicable
SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY TEXT) > 1
OR TEXT IS NULL;
-- checking if PK,TEXT columns are applicable
SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY PK,TEXT) > 1
OR PK IS NULL
OR TEXT IS NULL;
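If a one-row summary is more convenient than a list of offending rows, the same check can be phrased with plain aggregates: compare COUNT(*), COUNT(col) and COUNT(DISTINCT col). The sketch below is an assumption-laden illustration rather than Snowflake code: it uses Python's built-in sqlite3 so it can run anywhere, and `make_demo_table` and `key_summary` are hypothetical helper names. The table and column names are interpolated directly into the SQL, so they must come from trusted input.

```python
import sqlite3


def make_demo_table():
    """Build the 4-row demo table from the question in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE PRIMARY_KEY_TEST (PK INTEGER, "TEXT" TEXT)')
    conn.executemany(
        "INSERT INTO PRIMARY_KEY_TEST VALUES (?, ?)",
        [(1, "TEST_TEXT"), (1, "TEST_TEXT"), (None, None), (None, None)],
    )
    return conn


def key_summary(conn, table, column):
    """Summarise how far a single column is from being a valid primary key.

    COUNT(col) skips NULLs and COUNT(DISTINCT col) skips NULLs and repeats,
    so the differences give the number of NULL rows and surplus duplicate rows.
    """
    sql = (
        f"SELECT COUNT(*), COUNT({column}), COUNT(DISTINCT {column}) "
        f"FROM {table}"
    )
    total, non_null, distinct = conn.execute(sql).fetchone()
    return {
        "rows": total,
        "nulls": total - non_null,
        "duplicates": non_null - distinct,
    }


if __name__ == "__main__":
    conn = make_demo_table()
    print(key_summary(conn, "PRIMARY_KEY_TEST", "PK"))
    # {'rows': 4, 'nulls': 2, 'duplicates': 1}
```

The column is a viable key exactly when both `nulls` and `duplicates` are 0. This form only covers a single column; for a multi-column key the PARTITION BY (or a GROUP BY) variant is the simpler route.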
I'd still prefer code that can throw an error though
You can use Snowflake Scripting and RAISE an exception:
BEGIN
LET my_exception EXCEPTION (-20002, 'Columns cannot be used as PK.');
IF (EXISTS(SELECT *
FROM PRIMARY_KEY_TEST
QUALIFY COUNT(*) OVER(PARTITION BY PK) > 1
OR PK IS NULL
)) THEN
RAISE my_exception;
END IF;
END;
-20002 (P0001): Uncaught exception of type 'MY_EXCEPTION' on line 8 at position 5 : Columns cannot be used as PK.
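If the check runs from a client or an orchestration job rather than inside Snowflake, the error can also be raised on the client side. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 as a stand-in for a real Snowflake connection, swaps QUALIFY for an equivalent GROUP BY ... HAVING so the SQL stays portable, and the names `PrimaryKeyViolation`, `assert_primary_key` and `make_demo_table` are invented for this example. Identifiers are interpolated into the SQL, so pass only trusted table and column names.

```python
import sqlite3


class PrimaryKeyViolation(Exception):
    """Raised when a column set contains duplicates or NULLs."""


def make_demo_table():
    """Build the 4-row demo table from the question in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE PRIMARY_KEY_TEST (PK INTEGER, "TEXT" TEXT)')
    conn.executemany(
        "INSERT INTO PRIMARY_KEY_TEST VALUES (?, ?)",
        [(1, "TEST_TEXT"), (1, "TEST_TEXT"), (None, None), (None, None)],
    )
    return conn


def assert_primary_key(conn, table, columns):
    """Raise PrimaryKeyViolation unless `columns` are unique and never NULL."""
    col_list = ", ".join(columns)
    null_check = " OR ".join(f"{c} IS NULL" for c in columns)
    # One offending key group is enough to fail, hence LIMIT 1.
    sql = (
        f"SELECT {col_list}, COUNT(*) FROM {table} "
        f"GROUP BY {col_list} "
        f"HAVING COUNT(*) > 1 OR {null_check} "
        f"LIMIT 1"
    )
    if conn.execute(sql).fetchone() is not None:
        raise PrimaryKeyViolation(
            f"Columns ({col_list}) cannot be used as PK in {table}."
        )


if __name__ == "__main__":
    try:
        assert_primary_key(make_demo_table(), "PRIMARY_KEY_TEST", ["PK"])
    except PrimaryKeyViolation as exc:
        print(exc)
```

At the call site this is the one-liner the question asks for: `assert_primary_key(conn, "PRIMARY_KEY_TEST", ["PK", "TEXT"])` either returns silently or stops the job.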