How to create one row per id per month?
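I want to create one row per ID for each month, up to the end_date month.
For example, the first customer id starts in October and ends in November, so I want two rows, one for each month in which the customer was active. In addition, I want a column that flags whether the id was active in that month.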
| id | start_date | end_date |
|----|------------|------------|
| a | 2021-10-02 | 2021-11-15 |
| b | 2021-11-13 | 2021-11-30 |
| c | 2021-11-16 | |
If there is no end_date, the id is still active and its rows must run up to the current month.
Sample data:
WITH t1 AS (
SELECT 'a' AS id, '2021-10-02'::date AS start_date, '2021-11-15'::date AS end_date UNION ALL
SELECT 'b' AS id, '2021-11-13'::date AS start_date, '2021-11-30'::date AS end_date UNION ALL
SELECT 'c' AS id, '2021-11-16'::date AS start_date, NULL::date AS end_date
)
Expected result:
| id | start_date | end_date | months | is_active |
|----|------------|------------|------------|-----------|
| a | 2021-10-02 | 2021-11-15 | 2021-10-01 | TRUE |
| a | 2021-10-02 | 2021-11-15 | 2021-11-01 | FALSE |
| b | 2021-11-13 | 2021-11-30 | 2021-11-01 | FALSE |
| c | 2021-11-16 | | 2021-11-01 | TRUE |
| c | 2021-11-16 | | 2021-12-01 | TRUE |
| c | 2021-11-16 | | 2022-01-01 | TRUE |
How can I achieve this in Snowflake?
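If you have a range, you need something that spans time to join against. This is where generator can be used, and I will put it in a CTE. I will also use ROW_NUMBER() to generate the sequence of month steps, which ensures there are no gaps. The 200 has to be hard-coded, so enter a value large enough for your data, or store the sequence in a table.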
WITH months AS (
SELECT
ROW_NUMBER() OVER (ORDER BY NULL) - 1 as rn
FROM TABLE(generator(rowcount => 200))
)
Next we truncate start_date to its month and find the number of months up to end_date, so that we can join it against our range:
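, range_prep AS (
    SELECT id,
        start_date,
        end_date,
        date_trunc(month, start_date) AS start_month,  -- first day of the start month
        -- months from start_month to end_date, or to today when end_date is NULL
        datediff(month, start_month, coalesce(end_date, CURRENT_DATE())) AS month_count
    FROM data
)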
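As a quick check of the month_count step, the standalone query below applies the same datediff call to the truncated start months and end dates of ids a and b from the sample data. It is only a sketch; the column aliases are made up for illustration. With the month part, datediff compares calendar months rather than counting full 30-day periods, which is why the later join on `m.rn <= r.month_count` includes both the first and the last month.

```sql
-- Sketch only: literals copied from the sample data, aliases are illustrative.
SELECT
    -- id a: October to November crosses one month boundary
    datediff(month, '2021-10-01'::date, '2021-11-15'::date) AS a_month_count,  -- 1
    -- id b: starts and ends inside November
    datediff(month, '2021-11-01'::date, '2021-11-30'::date) AS b_month_count;  -- 0
```

Since rn starts at 0, a month_count of 1 yields two rows (rn 0 and 1) and a month_count of 0 yields one row.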
Join the two together, and then do:
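SELECT id,
    r.start_date,
    r.end_date,
    dateadd(month, m.rn, r.start_month) AS months,  -- one row per month step
    -- active if there is no end_date, or the end month is after this row's month
    (r.end_date IS NULL) OR (date_trunc(month, r.end_date) > months) AS is_active
FROM range_prep AS r
JOIN months AS m
    ON m.rn <= r.month_count
ORDER BY 1, 2, 4;  -- also sort by months so rows within an id come out in date order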
Putting it all together, with a CTE for data, we have:
WITH data AS (
    -- sample data; replace with your own table
    SELECT id,
        to_date(start_date) AS start_date,
        to_date(end_date) AS end_date
    FROM VALUES
        ('a','2021-10-02','2021-11-15'),
        ('b','2021-11-13','2021-11-30'),
        ('c','2021-11-16',null)
        v(id, start_date, end_date)
), months AS (
    -- gap-free sequence 0..199 of month steps; 200 must be a constant
    SELECT
        ROW_NUMBER() OVER (ORDER BY NULL) - 1 AS rn
    FROM TABLE(generator(rowcount => 200))
), range_prep AS (
    SELECT id,
        start_date,
        end_date,
        date_trunc(month, start_date) AS start_month,
        -- months from start_month to end_date, or to today when end_date is NULL
        datediff(month, start_month, coalesce(end_date, CURRENT_DATE())) AS month_count
    FROM data
)
SELECT id,
    r.start_date,
    r.end_date,
    dateadd(month, m.rn, r.start_month) AS months,
    -- active if there is no end_date, or the end month is after this row's month
    (r.end_date IS NULL) OR (date_trunc(month, r.end_date) > months) AS is_active
FROM range_prep AS r
JOIN months AS m
    ON m.rn <= r.month_count
ORDER BY 1, 2, 4;  -- also sort by months so rows within an id come out in date order
This gives:
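| ID | START_DATE | END_DATE | MONTHS | IS_ACTIVE |
|----|------------|------------|------------|-----------|
| a | 2021-10-02 | 2021-11-15 | 2021-10-01 | TRUE |
| a | 2021-10-02 | 2021-11-15 | 2021-11-01 | FALSE |
| b | 2021-11-13 | 2021-11-30 | 2021-11-01 | FALSE |
| c | 2021-11-16 | | 2021-11-01 | TRUE |
| c | 2021-11-16 | | 2021-12-01 | TRUE |
| c | 2021-11-16 | | 2022-01-01 | TRUE |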
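The answer mentions that, instead of hard-coding the generator inside every query, the step sequence can be put into a table. A minimal sketch of that idea follows; the table name month_steps and the row count 1000 are assumptions chosen for illustration, not part of the original answer.

```sql
-- Hypothetical helper table: the name month_steps and the size 1000 are assumptions.
-- It holds the gap-free sequence 0..999, built the same way as the months CTE.
CREATE OR REPLACE TABLE month_steps AS
SELECT
    ROW_NUMBER() OVER (ORDER BY NULL) - 1 AS rn
FROM TABLE(generator(rowcount => 1000));
```

The final query could then join against month_steps in place of the months CTE, with the same `m.rn <= r.month_count` condition.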