RANK function doesn't work on a CTE table in PostgreSQL?
I am trying to rank the output of my CTE table (a) by reviews, partitioned by two columns (risk_full_nm and exam_year_nb).
Below is my code:
WITH a AS
(
select rr.risk_full_nm ,e.exam_year_nb,COUNT(DISTINCT e.exam_dmn_xtrnl_id) as "reviews"
from examwrksp.dsh_exam_base e
left join star.firm f ON f.firm_id = e.crd_nb --firm
left join examwrksp.dsh_mlstn_event_dates d ON d.exam_id = e.exam_id --exam milestones
left join examwrksp.dsh_scope_risk_cntnt rr on rr.exam_id = e.exam_id
where e.exam_year_nb >= '2018' and e.exam_ctgry_cd = 'CYCLE' and e.exam_st_ds in ('Open','Closed') and e.dstrt_nm != 'MAP Group' and e.exam_type_ds not like 'Funding%'
and e.exam_dmn_prmry_fl = 'Y' and f.main_ofc = 'Y' and d.crd_nb is null
and rr.scope_actv_st is not null and rr.scope_actv_st = 'ACTIVE' and rr.unit_type_label_tx not in ('Discovery Review','Risk Identification Review')
GROUP BY rr.risk_full_nm ,e.exam_year_nb
)
select a.risk_full_nm,a.exam_year_nb,a.reviews,
RANK () OVER(PARTITION BY a.risk_full_nm,a.exam_year_nb ORDER BY a.reviews desc)
from a
order by reviews desc
I expect the rank column to rank my records, but I get "1" in every row. What am I missing?
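I analyzed your query. The logic of the window function is wrong. Let me explain:
- A window function is calculated separately for each partition, and the partitions are defined by the columns listed after over (partition by. In your query you partition by two columns, a.risk_full_nm and a.exam_year_nb. The CTE already groups by exactly these two columns, so every combination of them appears only once in its output and no two rows share a partition. RANK() therefore starts over for every row and always returns 1. To make this easier to see, I will explain it with sample queries.
For example: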
with test_data as materialized
(
select 1 as id, 'user1' as username, 'admin' as type_of, '2022-01-14'::date as login_date
union all
select 2 as id, 'user2' as username, 'user' as type_of, '2022-01-06'::date as login_date
union all
select 3 as id, 'user1' as username, 'admin' as type_of, '2022-01-29'::date as login_date
union all
select 4 as id, 'user3' as username, 'user' as type_of, '2022-02-11'::date as login_date
union all
select 5 as id, 'user2' as username, 'user' as type_of, '2022-01-16'::date as login_date
union all
select 6 as id, 'user2' as username, 'user' as type_of, '2022-01-17'::date as login_date
union all
select 7 as id, 'user2' as username, 'user' as type_of, '2022-01-18'::date as login_date
)
select
id,
username,
type_of,
count(*) over (partition by id, username, type_of) as login_count
from
test_data;
-- Result of this query
id  username  type_of  login_count
----------------------------------
1   user1     admin    1
2   user2     user     1
3   user1     admin    1
4   user3     user     1
5   user2     user     1
6   user2     user     1
7   user2     user     1
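In this query I wanted to get the number of logins for each user, but the result is wrong. The syntax is valid; the partitioning logic is not. Because the partition list includes the unique primary key field 'id', every row is distinct and ends up in its own partition. So for each of these rows, count() sees only one row and returns 1.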
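The same effect can be reproduced with rank(), which is the function used in the question. This is only a minimal sketch: the `logins` name and the cut-down inline rows are assumptions made for illustration, taken from the sample data above.

```sql
-- Hypothetical cut-down copy of the sample rows, inlined so the query runs on its own
with logins (id, username, login_date) as (
    values
        (1, 'user1', date '2022-01-14'),
        (3, 'user1', date '2022-01-29'),
        (2, 'user2', date '2022-01-06'),
        (5, 'user2', date '2022-01-16')
)
select
    id,
    username,
    login_date,
    -- id is unique, so each partition holds a single row and the rank is always 1
    rank() over (partition by id order by login_date) as rank_by_id,
    -- each username has several rows, so the rank counts up within the user
    rank() over (partition by username order by login_date) as rank_by_user
from logins
order by username, login_date;
```

Here rank_by_id is 1 on every row, while rank_by_user runs 1, 2 within each user.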
When we remove the 'id' field from the partition list, the query returns the correct data and count() is calculated correctly. For example:
select
id,
username,
type_of,
count(*) over (partition by username, type_of) as login_count
from
test_data;
-- Result of this query:
id  username  type_of  login_count
----------------------------------
3   user1     admin    2
1   user1     admin    2
7   user2     user     4
2   user2     user     4
5   user2     user     4
6   user2     user     4
4   user3     user     1
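Applied to the query in the question, one possible fix is to partition by fewer columns than the CTE groups by. The sketch below assumes the goal is to rank risks within each exam year. Since the real tables are not available here, the aggregated CTE is replaced by a few made-up rows, and the `review_rank` alias is also just an illustrative name.

```sql
-- Hypothetical stand-in for the aggregated CTE "a" from the question
with a (risk_full_nm, exam_year_nb, reviews) as (
    values
        ('Risk A', '2018', 12),
        ('Risk B', '2018', 7),
        ('Risk A', '2019', 5),
        ('Risk B', '2019', 9)
)
select
    a.risk_full_nm,
    a.exam_year_nb,
    a.reviews,
    -- partition only by year: each partition now holds several risks to compare
    rank() over (partition by a.exam_year_nb order by a.reviews desc) as review_rank
from a
order by a.exam_year_nb, review_rank;
```

If a single overall ranking is wanted instead, the PARTITION BY clause can be dropped entirely, i.e. rank() over (order by a.reviews desc), which ranks all rows together.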