Replace NULL values per partition
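For each session_id, I want to fill the NULL values in the device column with the associated non-NULL value. How can I achieve that?
Here is the sample data: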
+------------+-------+---------+
| session_id | step  | device  |
+------------+-------+---------+
| 351acc     | step1 |         |
| 351acc     | step2 |         |
| 351acc     | step3 | mobile  |
| 351acc     | step4 | mobile  |
| 350bca     | step1 | desktop |
| 350bca     | step2 |         |
| 350bca     | step3 |         |
| 350bca     | step4 | desktop |
+------------+-------+---------+
Desired output:
+------------+-------+---------+
| session_id | step  | device  |
+------------+-------+---------+
| 351acc     | step1 | mobile  |
| 351acc     | step2 | mobile  |
| 351acc     | step3 | mobile  |
| 351acc     | step4 | mobile  |
| 350bca     | step1 | desktop |
| 350bca     | step2 | desktop |
| 350bca     | step3 | desktop |
| 350bca     | step4 | desktop |
+------------+-------+---------+
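SELECT session_id, step,
       -- No ORDER BY in the window: max() then sees the whole partition, not a running frame.
       COALESCE(device, max(device) OVER (PARTITION BY session_id)) AS device
FROM tbl;  -- "table" is a reserved word, so a placeholder name is used here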
Going by your data sample, each session has a single device, so you can add a subquery that fetches the value from the other rows:
WITH j (session_id, step, device) AS (
  VALUES ('351acc','step1',NULL),
         ('351acc','step2',NULL),
         ('351acc','step3','mobile'),
         ('351acc','step4','mobile'),
         ('350bca','step1','desktop'),
         ('350bca','step2',NULL),
         ('350bca','step3',NULL),
         ('350bca','step4','desktop')
)
SELECT session_id, step,
       (SELECT DISTINCT device
        FROM j q2
        WHERE q2.session_id = q1.session_id AND q2.device IS NOT NULL) AS device
FROM j q1
ORDER BY session_id, step;
 session_id | step  | device
------------+-------+---------
 350bca     | step1 | desktop
 350bca     | step2 | desktop
 350bca     | step3 | desktop
 350bca     | step4 | desktop
 351acc     | step1 | mobile
 351acc     | step2 | mobile
 351acc     | step3 | mobile
 351acc     | step4 | mobile
(8 rows)
Demo: db<>fiddle
The window function first_value() with the right ordering is probably cheapest:
SELECT session_id, step
, COALESCE(device
, first_value(device) OVER (PARTITION BY session_id ORDER BY device IS NULL, step)
) AS device
FROM tbl
ORDER BY session_id DESC, step;
db<>fiddle here
ORDER BY device IS NULL, step sorts NULL values last, so the earliest step with a non-null value is picked. See:
- Sorting null values after all others, except special
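To see why this ordering puts NULL values last, here is a minimal sketch, assuming PostgreSQL, where the boolean expression device IS NULL sorts false before true. The inline VALUES list and the alias t are made up for illustration and stand in for one session of the sample data:

```sql
-- Illustrative data only: one session from the question, inlined as a VALUES list.
SELECT step, device, device IS NULL AS device_is_null
FROM  (VALUES ('step1', NULL)
            , ('step2', NULL)
            , ('step3', 'mobile')
            , ('step4', 'mobile')) AS t(step, device)
ORDER  BY device IS NULL, step;  -- false (non-null device) sorts before true (NULL)
-- Row order: step3, step4, step1, step2
```

The first row of that ordering is the one first_value() returns for every row of the partition.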
If the non-null device is always the same within a session_id, you can simplify to just ORDER BY device IS NULL. And you don't need COALESCE.
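The simplified variant could look like the sketch below. It assumes PostgreSQL and at most one distinct non-null device per session_id. The CTE named tbl only inlines the sample data from the question so that the query runs on its own:

```sql
-- Assumption: each session_id has at most one distinct non-null device.
WITH tbl (session_id, step, device) AS (
   VALUES ('351acc', 'step1', NULL)
        , ('351acc', 'step2', NULL)
        , ('351acc', 'step3', 'mobile')
        , ('351acc', 'step4', 'mobile')
        , ('350bca', 'step1', 'desktop')
        , ('350bca', 'step2', NULL)
        , ('350bca', 'step3', NULL)
        , ('350bca', 'step4', 'desktop')
   )
SELECT session_id, step
       -- Non-null rows sort first, so first_value() returns a non-null device
       -- for every row of the partition; no COALESCE is needed.
     , first_value(device) OVER (PARTITION BY session_id ORDER BY device IS NULL) AS device
FROM   tbl
ORDER  BY session_id DESC, step;
```

A session whose rows are all NULL still yields NULL, since there is no value to pick.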