Create recursive query in Snowflake with a Left Join condition?
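I am trying to create a recursive query that depends on a LEFT JOIN condition, but I'm not sure whether that is possible, especially in Snowflake.
I have three tables: ITEM, ITEMHIERARCHY, and ITEMVALUE: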
CREATE TABLE ITEM
(
NAME STRING
);
INSERT INTO ITEM(NAME)
VALUES
('Item1'),('Item2'),('Item3'),('Item4'),('Item5'),('Item6');
CREATE TABLE ITEMHIERARCHY
(
ITEM STRING,
SUBITEM STRING
);
INSERT INTO ITEMHIERARCHY(ITEM,SUBITEM)
VALUES
('Item2','Item3'),('Item2','Item4'),('Item4','Item5'),('Item6','Item4');
CREATE TABLE ITEMVALUE
(
ITEM STRING,
VALUE NUMERIC(25,10)
);
INSERT INTO ITEMVALUE(ITEM,VALUE)
VALUES
('Item1',34.2),('Item3',40.5),('Item5',20.3),('Item6',77.7);
My goal is to return a list of all ITEMs with their values, rolling up sub-item values where needed:
Item1, 34.2
Item2, 60.8 //roll-up of Item3 + Item4
Item3, 40.5
Item4, 20.3 //roll-up of Item5
Item5, 20.3
Item6, 77.7 //since Item6 value is given, don't roll-up from Item4
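Note that even though Item6 is a roll-up of Item4, the roll-up is ignored because the ITEMVALUE table already has a given value of 77.7 for it.
My attempt at a recursive query fails because of the LEFT JOIN in the UNION ALL clause: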
WITH RECURSIVE ITEMHIERARCHYFULL
-- Column names for the "view"/CTE
(ITEM,SUBITEM,VALUE)
AS
-- Common Table Expression
(
-- Anchor Clause
SELECT it.NAME ITEM, ih.SUBITEM, iv.VALUE
FROM ITEM it
--These left-joins work
LEFT JOIN ITEMVALUE iv ON iv.ITEM = it.NAME
LEFT JOIN ITEMHIERARCHY ih ON ih.ITEM = it.NAME
AND iv.VALUE IS NULL
UNION ALL
-- Recursive Clause
SELECT ihf.ITEM, ih.SUBITEM,
IFF(ihf.VALUE IS NOT NULL,ihf.VALUE,iv.VALUE)
FROM ITEMHIERARCHYFULL ihf
LEFT JOIN ITEMVALUE iv ON iv.ITEM = ihf.SUBITEM
LEFT JOIN ITEMHIERARCHY ih ON ih.ITEM = ihf.SUBITEM
AND iv.VALUE IS NULL
)
-- This is the "main select".
SELECT ITEM, SUM(VALUE) AS VALUE
FROM ITEMHIERARCHYFULL
GROUP BY ITEM
ORDER BY ITEM
;
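The goal of the query is to first take all top-level ITEMs from the ITEM table, look up the corresponding value in ITEMVALUE and, if none is found, join the ITEMHIERARCHY table to retrieve all SUBITEMs that make up the top-level ITEM. Then I want to recursively search the ITEMVALUE table for SUBITEM-VALUE matches or, if none is found, fall back to the ITEMHIERARCHY table again.
The first set of LEFT JOINs works, but the ones under the UNION ALL do not, giving me the error: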
SQL compilation error: OUTER JOINs with a self reference are not allowed in a recursive CTE.
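Is there a better way to do what I'm trying to do in Snowflake, or am I not thinking about this problem correctly?
Currently I have written the recursion out by hand to 5 levels, which means that if the ITEMHIERARCHY table gets more complex I have to add another level.
Here is a working example that gives you the expected result. You can also view it on SQLFiddle.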
WITH CTE AS
(
SELECT
i.NAME
, IH.SUBITEM AS descendant
, CASE WHEN IV.VALUE IS NULL THEN 1 ELSE 0 END AS LEVEL
FROM ITEM AS i
LEFT JOIN ITEMHIERARCHY AS IH
ON i.NAME = IH.ITEM
LEFT JOIN ITEMVALUE AS IV
ON I.NAME = IV.ITEM
UNION ALL
SELECT
CTE.NAME
, sIH.SUBITEM
, 1 AS LEVEL
FROM CTE
INNER JOIN ITEM AS si
ON CTE.descendant = si.NAME
INNER JOIN ITEMHIERARCHY AS sIH
ON si.NAME = sIH.ITEM
), CTE2 AS
(
SELECT
CTE.NAME
, LEVEL
, SUM(IV.VALUE) AS VALUE
, ROW_NUMBER()OVER(PARTITION BY CTE.NAME ORDER BY CTE.LEVEL ASC) AS RNK
FROM CTE
LEFT JOIN ITEMVALUE AS IV
ON (CTE.LEVEL=0 AND CTE.NAME = IV.ITEM)
OR (CTE.LEVEL <> 0 AND CTE.descendant = IV.ITEM)
GROUP BY CTE.NAME, CTE.LEVEL
)
SELECT
NAME
, VALUE
FROM CTE2
WHERE RNK = 1
ORDER BY
NAME
;
Result:
NAME VALUE
Item1 34.2000000000
Item2 60.8000000000
Item3 40.5000000000
Item4 20.3000000000
Item5 20.3000000000
Item6 77.7000000000
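There is a Stack Overflow question on why LEFT JOINs are not allowed in recursive queries: link. Basically it is to prevent infinite recursion, which seems like a somewhat weak reason to me. The second answer there also suggests that, if your SQL dialect supports OUTER APPLY, you can use it as a functional equivalent, but Snowflake does not have that feature.
Here is my manual "recursive" solution for a hierarchy of up to 3 levels: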
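One possible way around the restriction, shown here only as a sketch: do the outer join in an ordinary (non-recursive) CTE first, so that the recursive clause needs nothing but an INNER JOIN. The names EDGE, WALK, ROOT, NODE and SUBVALUE are made up for this illustration and do not come from the original posts. The sketch assumes the hierarchy has no cycles, and it has not been verified against a Snowflake account.

```sql
WITH RECURSIVE
-- Non-recursive helper: each parent/child edge together with the child's own
-- value (NULL when the child has no row in ITEMVALUE). The LEFT JOIN lives
-- here, outside the recursive clause.
EDGE (ITEM, SUBITEM, SUBVALUE) AS
(
    SELECT ih.ITEM, ih.SUBITEM, iv.VALUE
    FROM ITEMHIERARCHY ih
    LEFT JOIN ITEMVALUE iv ON iv.ITEM = ih.SUBITEM
),
WALK (ROOT, NODE, VALUE) AS
(
    -- Anchor clause: every item starts as its own root, with its own value if any.
    SELECT it.NAME, it.NAME, iv.VALUE
    FROM ITEM it
    LEFT JOIN ITEMVALUE iv ON iv.ITEM = it.NAME
    UNION ALL
    -- Recursive clause: descend only from nodes that have no value of their own.
    SELECT w.ROOT, e.SUBITEM, e.SUBVALUE
    FROM WALK w
    INNER JOIN EDGE e ON e.ITEM = w.NODE
    WHERE w.VALUE IS NULL
)
-- SUM ignores the NULLs carried by intermediate nodes.
SELECT ROOT AS ITEM, SUM(VALUE) AS VALUE
FROM WALK
GROUP BY ROOT
ORDER BY ROOT;
```

Traced by hand against the sample data, Item6 stops at its own 77.7 because the WHERE clause never expands a node that already has a value, while Item2 expands to Item3 (40.5) and, through Item4, to Item5 (20.3).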
SELECT rec.ITEM,
SUM(CASE
WHEN rec.VALUE1 IS NOT NULL THEN rec.VALUE1
WHEN rec.VALUE2 IS NOT NULL THEN rec.VALUE2
ELSE rec.VALUE3
END) VALUE
FROM (
SELECT it.NAME ITEM,
ih1.SUBITEM SUBITEM1, CASE
WHEN iv1.VALUE IS NOT NULL THEN iv1.Value
ELSE iv1s.Value
END Value1,
ih2.SUBITEM SUBITEM2, CASE
WHEN iv2.VALUE IS NOT NULL THEN iv2.Value
ELSE iv2s.Value
END Value2,
ih3.SUBITEM SUBITEM3, CASE
WHEN iv3.VALUE IS NOT NULL THEN iv3.Value
ELSE iv3s.Value
END Value3
FROM ITEM it
LEFT JOIN ITEMVALUE iv1 ON iv1.ITEM = it.NAME
LEFT JOIN ITEMHIERARCHY ih1 ON ih1.ITEM = it.NAME
AND iv1.VALUE IS NULL
LEFT JOIN ITEMVALUE iv1s ON iv1s.ITEM = ih1.SUBITEM
LEFT JOIN ITEMVALUE iv2 ON iv2.ITEM = ih1.SUBITEM
LEFT JOIN ITEMHIERARCHY ih2 ON ih2.ITEM = ih1.SUBITEM
AND iv1.VALUE IS NULL
AND iv1s.VALUE IS NULL
AND iv2.VALUE IS NULL
LEFT JOIN ITEMVALUE iv2s ON iv2s.ITEM = ih2.SUBITEM
LEFT JOIN ITEMVALUE iv3 ON iv3.ITEM = ih2.SUBITEM
LEFT JOIN ITEMHIERARCHY ih3 ON ih3.ITEM = ih2.SUBITEM
AND iv1.VALUE IS NULL
AND iv1s.VALUE IS NULL
AND iv2.VALUE IS NULL
AND iv2s.VALUE IS NULL
AND iv3.VALUE IS NULL
LEFT JOIN ITEMVALUE iv3s ON iv3s.ITEM = ih3.SUBITEM
) rec
WHERE CASE
WHEN VALUE1 IS NOT NULL THEN VALUE1
WHEN VALUE2 IS NOT NULL THEN VALUE2
ELSE VALUE3
END IS NOT NULL
GROUP BY ITEM
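This is obviously a very inefficient approach syntactically: at every step you have to check the ITEM and SUBITEM values and then repeat the NULL checks for every preceding ITEMVALUE or SUBITEMVALUE table. I added the SUBITEMs for each level so that, if you run only the inner part of the query, you can see how the expansion works. I also had to use CASE statements to get it working on SQL Fiddle, but I would prefer to use IFF and IFNULL(Value1,IFNULL(Value2,Value3)).
Here is the working code on SQL Fiddle: link and the output: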
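As a small aside, the nested IFNULL above can also be written with COALESCE, which takes any number of arguments and returns the first one that is not NULL. The one-row input below is hypothetical and exists only to make the snippet self-contained.

```sql
-- Hypothetical one-row input: VALUE1 is missing, VALUE2 and VALUE3 are present.
SELECT
    IFNULL(v.VALUE1, IFNULL(v.VALUE2, v.VALUE3)) AS NESTED_IFNULL,
    COALESCE(v.VALUE1, v.VALUE2, v.VALUE3)       AS FLAT_COALESCE
FROM (
    SELECT CAST(NULL AS NUMERIC(25,10)) AS VALUE1,
           CAST(20.3 AS NUMERIC(25,10)) AS VALUE2,
           CAST(77.7 AS NUMERIC(25,10)) AS VALUE3
) v;
```

Both columns pick VALUE2 here, since it is the first non-NULL argument in each expression.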
Item1, 34.2
Item2, 60.8
Item3, 40.5
Item4, 20.3
Item5, 20.3
Item6, 77.7