Valid_from Valid_to from a fully loaded table
I have a source table that receives a full load of the data every month. The table looks like the example below.
Source table:
pk | code_paym | code_terms | etl_id |
---|---|---|---|
1 | 2 | 3 | 2020-08-01 |
1 | 2 | 3 | 2020-09-01 |
1 | 2 | 4 | 2020-10-01 |
1 | 2 | 4 | 2020-11-01 |
1 | 2 | 4 | 2020-12-01 |
1 | 2 | 4 | 2021-01-01 |
1 | 2 | 3 | 2021-02-01 |
1 | 2 | 3 | 2021-03-01 |
1 | 2 | 3 | 2021-04-01 |
1 | 2 | 3 | 2021-05-01 |
I want to create valid_from and valid_to columns from the source table, as in the example below.
Desired output:
pk | code_paym | code_terms | valid_from | valid_to |
---|---|---|---|---|
1 | 2 | 3 | 2020-08-01 | 2020-09-01 |
1 | 2 | 4 | 2020-10-01 | 2021-01-01 |
1 | 2 | 3 | 2021-02-01 | 2021-05-01 |
As you can see, over time an attribute can return to a value it had before. How can I produce this output with SQL?
Many thanks, regards
Use the CONDITIONAL_TRUE_EVENT window function to identify the consecutive subgroups:
CREATE OR REPLACE TABLE t( pk INT, code_paym INT, code_terms INT, etl_id DATE)
AS
SELECT 1, 2, 3, '2020-08-01'
UNION ALL SELECT 1, 2, 3, '2020-09-01'
UNION ALL SELECT 1, 2, 4, '2020-10-01'
UNION ALL SELECT 1, 2, 4, '2020-11-01'
UNION ALL SELECT 1, 2, 4, '2020-12-01'
UNION ALL SELECT 1, 2, 4, '2021-01-01'
UNION ALL SELECT 1, 2, 3, '2021-02-01'
UNION ALL SELECT 1, 2, 3, '2021-03-01'
UNION ALL SELECT 1, 2, 3, '2021-04-01'
UNION ALL SELECT 1, 2, 3, '2021-05-01';
Query:
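WITH cte AS (
    SELECT t.*,
           -- grp increments each time CODE_TERMS differs from the previous row;
           -- LAG defaults to the current value, so the first row is never a change
           CONDITIONAL_TRUE_EVENT(
               CODE_TERMS != LAG(CODE_TERMS, 1, CODE_TERMS)
                                 OVER (PARTITION BY PK, CODE_PAYM ORDER BY ETL_ID)
           ) OVER (PARTITION BY PK, CODE_PAYM ORDER BY ETL_ID) AS grp
    FROM t
)
-- one row per consecutive run: first and last load date of that run
SELECT PK, CODE_PAYM, grp, MIN(ETL_ID) AS valid_from, MAX(ETL_ID) AS valid_to
FROM cte
GROUP BY PK, CODE_PAYM, grp;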
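Output: one row per consecutive run of code_terms within (pk, code_paym), where grp numbers the runs and valid_from / valid_to are the first and last etl_id of that run. To return code_terms itself, as in the desired output, add CODE_TERMS to both the SELECT list and the GROUP BY.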
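For anyone who wants to check the grouping logic without a Snowflake account, here is a minimal sketch in Python using the standard-library sqlite3 module. It is a port under assumptions, not the Snowflake query itself: SQLite has no CONDITIONAL_TRUE_EVENT, so a running SUM over a 0/1 change flag stands in for it. The names `valid_ranges`, `flagged`, `grouped` and `changed` are made up for illustration, and code_terms is added to the SELECT and GROUP BY so the rows line up with the desired output in the question. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Same ten monthly snapshots as the sample table t.
ROWS = [
    (1, 2, 3, "2020-08-01"), (1, 2, 3, "2020-09-01"),
    (1, 2, 4, "2020-10-01"), (1, 2, 4, "2020-11-01"),
    (1, 2, 4, "2020-12-01"), (1, 2, 4, "2021-01-01"),
    (1, 2, 3, "2021-02-01"), (1, 2, 3, "2021-03-01"),
    (1, 2, 3, "2021-04-01"), (1, 2, 3, "2021-05-01"),
]

# Assumed SQLite port: SUM(changed) as a running total plays the role of
# CONDITIONAL_TRUE_EVENT, which SQLite does not provide.
QUERY = """
WITH flagged AS (
    SELECT t.*,
           -- 1 when code_terms differs from the previous snapshot, else 0
           CASE WHEN code_terms != LAG(code_terms, 1, code_terms)
                     OVER (PARTITION BY pk, code_paym ORDER BY etl_id)
                THEN 1 ELSE 0 END AS changed
    FROM t
),
grouped AS (
    SELECT flagged.*,
           -- running count of changes = number of the consecutive run
           SUM(changed) OVER (PARTITION BY pk, code_paym ORDER BY etl_id) AS grp
    FROM flagged
)
SELECT pk, code_paym, code_terms,
       MIN(etl_id) AS valid_from,
       MAX(etl_id) AS valid_to
FROM grouped
GROUP BY pk, code_paym, code_terms, grp
ORDER BY pk, code_paym, valid_from;
"""


def valid_ranges(rows):
    """Load rows into an in-memory table t and return one tuple per run."""
    con = sqlite3.connect(":memory:")
    try:
        # ISO dates are stored as text; they sort correctly as strings.
        con.execute(
            "CREATE TABLE t (pk INT, code_paym INT, code_terms INT, etl_id TEXT)"
        )
        con.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)
        return con.execute(QUERY).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    for row in valid_ranges(ROWS):
        print(row)
```

On the sample data this prints three rows, matching the desired output: (1, 2, 3, '2020-08-01', '2020-09-01'), (1, 2, 4, '2020-10-01', '2021-01-01') and (1, 2, 3, '2021-02-01', '2021-05-01'). Grouping by grp as well as code_terms is what keeps the two separate runs of code_terms = 3 from collapsing into one range.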