Partition by, dense rank
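I have the following table describing patient visits: each visit has a visit_id and is with a particular physician. I am trying to extract the visit_id of the visit at which the patient saw their 3rd physician (the 3rd distinct physician, not the 3rd visit).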
patient | visit_id | physician |
---|---|---|
a | 1 | id_1 |
a | 2 | id_2 |
a | 3 | id_1 |
a | 4 | id_3 |
b | 5 | id_1 |
b | 6 | id_2 |
c | 7 | id_1 |
c | 8 | id_2 |
c | 9 | id_3 |
So the result would be:
patient | visit_id |
---|---|
a | 4 |
c | 9 |
Any suggestions?
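The statement below returns your result. The innermost subquery collapses repeat visits to the same physician into a single row, keeping the earliest visit. row_number() then numbers each patient's physicians in order of that first visit, and the outermost select picks the third physician.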
select patient, visit_id
from (select patient, visit_id,
             row_number() over (partition by patient order by visit_id) rn
      from (select patient, min(visit_id) as visit_id  -- first visit to each physician
            from tab                                   -- tab stands for the visits table from the question
            group by patient, physician
           ) t1
     ) t2
where t2.rn = 3
Result:
patient | visit_id |
---|---|
a | 4 |
c | 9 |
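To see why this works, it can help to run the innermost step on its own. The sketch below inlines the sample rows from the question under the placeholder name tab and lists the first visit to each physician per patient. The alias first_visit_id is only an illustrative name, and the WITH ... VALUES form assumes a Presto-style engine.

```sql
-- Sample rows from the question, inlined so the snippet is self-contained.
-- "tab" is a placeholder table name; "first_visit_id" is an illustrative alias.
WITH tab (patient, visit_id, physician) AS (
    VALUES ('a', 1, 'id_1'),
           ('a', 2, 'id_2'),
           ('a', 3, 'id_1'),
           ('a', 4, 'id_3'),
           ('b', 5, 'id_1'),
           ('b', 6, 'id_2'),
           ('c', 7, 'id_1'),
           ('c', 8, 'id_2'),
           ('c', 9, 'id_3')
)
-- One row per (patient, physician), keeping the earliest visit to that physician.
SELECT patient, physician, min(visit_id) AS first_visit_id
FROM tab
GROUP BY patient, physician
ORDER BY patient, first_visit_id
-- a | id_1 | 1
-- a | id_2 | 2
-- a | id_3 | 4
-- b | id_1 | 5
-- b | id_2 | 6
-- c | id_1 | 7
-- c | id_2 | 8
-- c | id_3 | 9
```

Patient a's repeat visit to id_1 (visit 3) has been folded away, so the third row for a is visit 4. Patient b has only two rows at this stage, so no row gets number 3 and b drops out of the final result.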
You can group by patient and physician to remove the "duplicate" physicians and use min for visit_id:
-- test data
WITH dataset (patient, visit_id, physician) AS (
    VALUES ('a', 1, 'id_1'),
           ('a', 2, 'id_2'),
           ('a', 3, 'id_1'),
           ('a', 4, 'id_3'),
           ('b', 5, 'id_1'),
           ('b', 6, 'id_2'),
           ('c', 7, 'id_1'),
           ('c', 8, 'id_2'),
           ('c', 9, 'id_3')
)
-- query
select patient, visit_id
from (
    select *,
           row_number() over (partition by patient order by visit_id) rnk
    from (
        select patient,
               min(visit_id) visit_id,
               physician
        from dataset
        group by patient, physician
    )
)
where rnk = 3
Output:
patient | visit_id |
---|---|
a | 4 |
c | 9 |
Note that this query uses Presto syntax (since your question is tagged presto).