PostgreSQL: GROUP BY for multiple rows
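I have this table named hr_holidays_by_calendar. I just want to filter out the rows where the same employee takes leave twice on the same day.
Table hr_holidays_by_calendar:
Query I tried (it does not come anywhere near solving this):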
select hol1.employee_id, hol1.leave_date, hol1.no_of_days, hol1.leave_state
from hr_holidays_by_calendar hol1
inner join
(select employee_id, leave_date
from hr_holidays_by_calendar hol1
group by employee_id, leave_date
having count(*)>1)sub
on hol1.employee_id=sub.employee_id and hol1.leave_date=sub.leave_date
where hol1.leave_state != 'refuse'
order by hol1.employee_id, hol1.leave_date
I assume you just need to reverse your logic. You can use NOT EXISTS:
select h1.employee_id, h1.leave_date, h1.no_of_days, h1.leave_state
from hr_holidays_by_calendar h1
where
h1.leave_state <> 'refuse'
and not exists (
select 1
from hr_holidays_by_calendar h2
where
h1.employee_id = h2.employee_id
and h1.leave_date = h2.leave_date
group by employee_id, leave_date
having count(*) > 1
)
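This discards every (employee, date) pair that has more than one row (more than one leave on the same day).
I did not take the number of days into account because the data looks wrong either way: you cannot take leave twice on the same day with the two leaves lasting different numbers of days. If your application allows that, consider applying additional logic. Also, you should not let such records get into the table in the first place :-)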
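One way to keep such records out of the table is a partial unique index. This is only a sketch under assumptions: the index name is hypothetical, and it assumes that refused leaves should not count toward the one-leave-per-day rule.

```sql
-- Sketch only: the index name is hypothetical.
-- Existing duplicates must be cleaned up first, otherwise
-- CREATE UNIQUE INDEX fails with a uniqueness violation.
CREATE UNIQUE INDEX hr_holidays_one_leave_per_day
    ON hr_holidays_by_calendar (employee_id, leave_date)
    WHERE leave_state <> 'refuse';
```

Rows where leave_state IS NULL do not satisfy the index predicate, so they are not covered by this constraint.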
I believe a simple use of GROUP BY can do the job for you:
select hol1.employee_id, hol1.leave_date, max(hol1.no_of_days)
from hr_holidays_by_calendar hol1
where hol1.leave_state != 'refuse'
group by hol1.employee_id, hol1.leave_date
It is unclear what should happen if the two rows have different no_of_days.
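To see whether that case actually occurs in the data, a query along these lines could list the conflicting groups. It is a sketch that assumes refused leaves should be ignored; the output column aliases are made up for illustration.

```sql
-- List (employee, date) pairs whose non-refused rows disagree on no_of_days.
SELECT employee_id,
       leave_date,
       count(*)        AS row_count,   -- illustrative alias
       min(no_of_days) AS min_days,    -- illustrative alias
       max(no_of_days) AS max_days     -- illustrative alias
FROM   hr_holidays_by_calendar
WHERE  leave_state <> 'refuse'
GROUP  BY employee_id, leave_date
HAVING count(DISTINCT no_of_days) > 1  -- NULL no_of_days values are not counted
ORDER  BY employee_id, leave_date;
```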
If you want the complete rows, one method uses window functions:
select hc.*
from (select hc.*, count(*) over (partition by employee_id, leave_date) as cnt
from hr_holidays_by_calendar hc
) hc
where cnt >= 2;
If you only need the employee ID and the date, then aggregation is appropriate.
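A minimal sketch of that aggregation, assuming refused leaves should not be counted (the leave_count alias is illustrative):

```sql
-- One row per (employee, date) pair that occurs more than once.
SELECT employee_id, leave_date, count(*) AS leave_count
FROM   hr_holidays_by_calendar
WHERE  leave_state <> 'refuse'   -- assumption: refused leaves are ignored
GROUP  BY employee_id, leave_date
HAVING count(*) > 1
ORDER  BY employee_id, leave_date;
```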
This returns all rows for which a duplicate exists:
SELECT employee_id, leave_date, no_of_days, leave_state
FROM hr_holidays_by_calendar h
WHERE EXISTS (
SELECT -- select list can be empty for EXISTS
FROM hr_holidays_by_calendar
WHERE employee_id = h.employee_id
AND leave_date = h.leave_date
AND leave_state <> 'refuse'
AND ctid <> h.ctid
)
AND leave_state <> 'refuse'
ORDER BY employee_id, leave_date;
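It is unclear where leave_state <> 'refuse' should apply. You have to define the requirements. My example completely excludes rows with leave_state = 'refuse' (and rows with leave_state IS NULL along with them!).
ctid is a poor man's surrogate for your undisclosed (undefined?) primary key.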
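The NULL behavior can be checked in isolation. The sketch below compares <> with IS DISTINCT FROM on three sample values; 'validate' is a made-up state used only for illustration.

```sql
-- Compare the two predicates on sample values, including NULL.
SELECT s.leave_state,
       s.leave_state <> 'refuse'               AS not_equal,
       s.leave_state IS DISTINCT FROM 'refuse' AS is_distinct
FROM  (VALUES ('refuse'), ('validate'), (NULL)) AS s(leave_state);
```

For the NULL row, not_equal is NULL (so a WHERE clause drops the row), while is_distinct is true. If rows with an unknown state should be kept, IS DISTINCT FROM may be the better fit.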
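If the table does have a primary key, the same query could compare on it instead of ctid. The column name id below is an assumption, since the question does not show the table definition.

```sql
-- Same duplicate check, using a hypothetical primary key column "id".
SELECT employee_id, leave_date, no_of_days, leave_state
FROM   hr_holidays_by_calendar h
WHERE  leave_state <> 'refuse'
AND    EXISTS (
   SELECT                        -- select list can be empty for EXISTS
   FROM   hr_holidays_by_calendar
   WHERE  employee_id = h.employee_id
   AND    leave_date  = h.leave_date
   AND    leave_state <> 'refuse'
   AND    id <> h.id             -- hypothetical primary key column
   )
ORDER  BY employee_id, leave_date;
```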
Related:
- How do I (or can I) SELECT DISTINCT on multiple columns?
- What is easier to read in EXISTS subqueries?