SQL 个不同的项目工作日期,不包括休息日期
SQL distinct Worked Dates across Projects excluding Break Dates
考虑以下架构;
CREATE TABLE `Project Assignment`
(`Employee` varchar(1), `Project Id` int, `Project Assignment Date` date, `Project Relieving Date` date)
;
INSERT INTO `Project Assignment`
(`Employee`, `Project Id`, `Project Assignment Date`, `Project Relieving Date`)
VALUES
('A', 1, '2018-04-01', '2019-12-25'),
('A', 2, '2019-06-15', '2020-03-31'),
('A', 3, '2019-09-07', '2020-05-20'),
('A', 4, '2020-07-14', '2020-12-15')
;
CREATE TABLE `Break`
(`Break Id` int, `Employee` varchar(1), `Project Id` int, `Break Start Date` date, `Break End Date` date)
;
INSERT INTO `Break`
(`Break Id`, `Employee`, `Project Id`, `Break Start Date`, `Break End Date`)
VALUES
(1, 'A', 1, '2018-09-01', '2018-09-30'),
(2, 'A', 1, '2019-10-05', '2019-11-30'),
(3, 'A', 2, '2019-10-15', '2019-11-15'),
(4, 'A', 3, '2019-11-01', '2019-11-10'),
(5, 'A', 2, '2020-01-01', '2020-01-10'),
(6, 'A', 3, '2020-01-01', '2020-01-10')
;
在项目期间,员工可以在每个项目中休息一次或多次。休息时间在项目中不重叠,但可以跨项目重叠。
我们想要计算一名员工至少分配了一个项目的天数(减去)员工在所有分配的项目上休息的天数。
我能够使用以下查询得出员工分配给项目的不同天数:
SELECT merged.employee,
SUM(DATEDIFF(merged.EndDate,merged.`Project Assignment Date`)+1) assigned_days
FROM (SELECT
s1.employee, s1.`Project Assignment Date`,
MIN(IFNULL(t1.`Project Relieving Date`,CURDATE())) AS EndDate
FROM `Project Assignment` s1
INNER JOIN `Project Assignment` t1
ON t1.employee = s1.employee
AND s1.`Project Assignment Date` <= IFNULL(t1.`Project Relieving Date`,CURDATE())
AND NOT EXISTS( SELECT * FROM `Project Assignment` t2
WHERE t2.employee = s1.employee
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) >= t2.`Project Assignment Date`
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) < IFNULL(t2.`Project Relieving Date`,CURDATE()))
WHERE NOT EXISTS( SELECT * FROM `Project Assignment` s2
WHERE s2.employee = s1.employee
AND s1.`Project Assignment Date` > s2.`Project Assignment Date`
AND s1.`Project Assignment Date` <= IFNULL(s2.`Project Relieving Date`,CURDATE()))
GROUP BY s1.employee, s1.`Project Assignment Date`
ORDER BY s1.`Project Assignment Date`) merged
GROUP BY merged.employee
结果:
| employee | assigned_days |
| -------- | ------------- |
| A | 936 |
但想不出一种方法来得出此人在所有分配的项目中休息的天数。
预期结果:
+----------+---------------+------------+-------------+
| employee | assigned_days | break_days | worked_days |
+==========+===============+============+=============+
| A | 936 | 50 | 886 |
+----------+---------------+------------+-------------+
Mariadb 10.3.29
锻炼说明break_days
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| Employee | Project | Break Start | Break End | Days Considered | Remarks |
+==========+=========+=============+==================+=================+===================================================================================================================+
| A | 1 | 2018-09-01 | 2018-09-30 | 30 | Only one project assigned so consider whole break |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| A | 1 | 2019-10-05 | 2019-11-30 | 10 | 3 Projects were assigned during these breaks. The common days of break fall between 2019-11-01 and 2019-11-10 |
+----------+---------+-------------+------------------+ | |
| A | 2 | 2019-10-15 | 2019-11-15 | | |
+----------+---------+-------------+------------------+ | |
| A | 3 | 2019-11-01 | 2019-11-10 | | |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| A | 2 | 2020-01-01 | 2020-01-10 | 10 | 2 Projects were assigned during this time and break in both projects |
+----------+---------+-------------+------------------+ | |
| A | 3 | 2020-01-01 | 2020-01-10 | | |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| | | | Total Break Days | 50 | |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
Link 对于 DB-Fiddle:https://www.db-fiddle.com/f/c8fMneAUkhb2P3rzjMtVZm/0
使用递归 CTE 获取每个员工的所有工作和所有休息日期。
然后,对于这两种情况下的每个日期,使用聚合将所有项目作为逗号分隔列表 GROUP_CONCAT()
.
如果这些列表与特定日期匹配,那么这是一个休息日期。
WITH RECURSIVE
working_dates AS (
SELECT `Employee`, `Project Id`, `Project Assignment Date` AS date, `Project Relieving Date`
FROM `Project Assignment`
UNION ALL
SELECT `Employee`, `Project Id`, date + INTERVAL 1 day, `Project Relieving Date`
FROM working_dates
WHERE date < `Project Relieving Date`
),
break_dates AS (
SELECT `Employee`, `Project Id`, `Break Start Date` AS date, `Break End Date`
FROM `Break`
UNION ALL
SELECT `Employee`, `Project Id`, date + INTERVAL 1 day, `Break End Date`
FROM break_dates
WHERE date < `Break End Date`
),
working AS (
SELECT `Employee`, date,
GROUP_CONCAT(`Project Id` ORDER BY `Project Id`) projects
FROM working_dates
GROUP BY `Employee`, date
),
breaks AS (
SELECT `Employee`, date,
GROUP_CONCAT(`Project Id` ORDER BY `Project Id`) projects
FROM break_dates
GROUP BY `Employee`, date
)
SELECT w.`Employee`,
COUNT(*) assigned_days,
COUNT(b.date) AS break_days,
COUNT(*) - COUNT(b.date) worked_days
FROM working w LEFT JOIN breaks b
ON w.`Employee` = b.`Employee` AND w.date = b.date AND w.projects = b.projects
GROUP BY w.`Employee`
参见demo。
将 Break Id
列添加到 Break
table 后,我可以根据 @forpass 建议的聚合技术来推导休息日:
Then, for each date in both cases, with aggregation get all the projects as a comma separated list with GROUP_CONCAT().
对于每个中断,获取重叠项目的计数和列表(使用 GROUP_CONCAT)。
然后在 Break
上再次加入它以查找重叠中断的计数和列表以及最小的公共重叠(最晚开始和最早结束)。使用 ROW_NUMBER
消除重复项。
将 Assigned Days 的查询移动到另一个 CTE 并与 CTE 合并以获取休息时间
想要的结果。
WITH breaks_summary AS (
SELECT `Employee`, SUM(break_days) break_days
FROM (
SELECT b.`Employee`, DATEDIFF(b.end_date, b.start_date)+1 break_days, ROW_NUMBER() OVER (PARTITION BY b.break_ids) rn, overlapping_breaks, break_ids, projects_count
FROM (
SELECT b_p_cnt.`Employee`, b_p_cnt.`Project Id`, b_p_cnt.projects_count,
COUNT(b2.`Break Id`) overlapping_breaks, GROUP_CONCAT(b2.`Break Id`) break_ids, MAX(b2.start_date) start_date, MIN(b2.end_date) end_date
FROM (
SELECT b1.`Break Id`, b1.`Employee`, b1.`Project Id`, b1.start_date, b1.end_date, GROUP_CONCAT(pa.`Project Id`) projects, count(pa.`Project Id`) projects_count
FROM (
SELECT `Break Id`, `Employee`, `Project Id`, `Break Start Date` AS start_date, `Break End Date` AS end_date
FROM `Break`
) b1
LEFT JOIN `Project Assignment` pa ON b1.`Employee` = pa.`Employee`
AND ((b1.start_date BETWEEN pa.`Project Assignment Date` AND IFNULL(pa.`Project Relieving Date`,CURDATE()))
OR (b1.end_date BETWEEN pa.`Project Assignment Date` AND IFNULL(pa.`Project Relieving Date`,CURDATE())))
GROUP BY b1.`Break Id`, b1.`Employee`, b1.`Project Id`, b1.start_date, b1.end_date) b_p_cnt
LEFT JOIN (
SELECT `Break Id`, `Employee`, `Project Id`, `Break Start Date` AS start_date, `Break End Date` AS end_date
FROM `Break`
ORDER BY `Break Id`) b2 ON b_p_cnt.`Employee` = b2.`Employee`
AND ((b_p_cnt.start_date BETWEEN b2.start_date AND b2.end_date)
OR (b_p_cnt.end_date BETWEEN b2.start_date AND b2.end_date))
GROUP BY b_p_cnt.`Break Id`, b_p_cnt.`Employee`, b_p_cnt.`Project Id`,
b_p_cnt.start_date, b_p_cnt.end_date, b_p_cnt.projects, b_p_cnt.projects_count
HAVING count(b2.`Break Id`) = b_p_cnt.projects_count
ORDER BY b_p_cnt.`Employee`, `Project Id`) b
) breaks
WHERE rn = 1
GROUP BY `Employee`),
assigned AS (
SELECT merged.`Employee`, SUM(DATEDIFF(merged.EndDate,merged.`Project Assignment Date`)+1) assigned_days
FROM (SELECT s1.`Employee`, s1.`Project Assignment Date`,
MIN(IFNULL(t1.`Project Relieving Date`,CURDATE())) AS EndDate
FROM `Project Assignment` s1
INNER JOIN `Project Assignment` t1 ON t1.`Employee` = s1.`Employee`
AND s1.`Project Assignment Date` <= IFNULL(t1.`Project Relieving Date`,CURDATE())
AND NOT EXISTS( SELECT * FROM `Project Assignment` t2
WHERE t2.`Employee` = s1.`Employee`
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) >= t2.`Project Assignment Date`
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) < IFNULL(t2.`Project Relieving Date`,CURDATE()))
WHERE NOT EXISTS( SELECT * FROM `Project Assignment` s2
WHERE s2.`Employee` = s1.`Employee`
AND s1.`Project Assignment Date` > s2.`Project Assignment Date`
AND s1.`Project Assignment Date` <= IFNULL(s2.`Project Relieving Date`,CURDATE()))
GROUP BY s1.`Employee`, s1.`Project Assignment Date`
ORDER BY s1.`Project Assignment Date`) merged
GROUP BY merged.`Employee`)
SELECT ad.`Employee`,
ad.assigned_days,
IFNULL(bs.break_days,0) break_days,
(ad.assigned_days - IFNULL(bs.break_days,0)) worked_days
FROM assigned ad
LEFT JOIN breaks_summary bs ON ad.`Employee` = bs.`Employee`
已更新 DB-Fiddle 查询:https://www.db-fiddle.com/f/c8fMneAUkhb2P3rzjMtVZm/3
感谢所有通过改进问题和提供可能答案做出贡献的人。
考虑以下架构;
CREATE TABLE `Project Assignment`
(`Employee` varchar(1), `Project Id` int, `Project Assignment Date` date, `Project Relieving Date` date)
;
INSERT INTO `Project Assignment`
(`Employee`, `Project Id`, `Project Assignment Date`, `Project Relieving Date`)
VALUES
('A', 1, '2018-04-01', '2019-12-25'),
('A', 2, '2019-06-15', '2020-03-31'),
('A', 3, '2019-09-07', '2020-05-20'),
('A', 4, '2020-07-14', '2020-12-15')
;
CREATE TABLE `Break`
(`Break Id` int, `Employee` varchar(1), `Project Id` int, `Break Start Date` date, `Break End Date` date)
;
INSERT INTO `Break`
(`Break Id`, `Employee`, `Project Id`, `Break Start Date`, `Break End Date`)
VALUES
(1, 'A', 1, '2018-09-01', '2018-09-30'),
(2, 'A', 1, '2019-10-05', '2019-11-30'),
(3, 'A', 2, '2019-10-15', '2019-11-15'),
(4, 'A', 3, '2019-11-01', '2019-11-10'),
(5, 'A', 2, '2020-01-01', '2020-01-10'),
(6, 'A', 3, '2020-01-01', '2020-01-10')
;
在项目期间,员工可以在每个项目中休息一次或多次。休息时间在项目中不重叠,但可以跨项目重叠。
我们想要计算一名员工至少分配了一个项目的天数(减去)员工在所有分配的项目上休息的天数。
我能够使用以下查询得出员工分配给项目的不同天数:
SELECT merged.employee,
SUM(DATEDIFF(merged.EndDate,merged.`Project Assignment Date`)+1) assigned_days
FROM (SELECT
s1.employee, s1.`Project Assignment Date`,
MIN(IFNULL(t1.`Project Relieving Date`,CURDATE())) AS EndDate
FROM `Project Assignment` s1
INNER JOIN `Project Assignment` t1
ON t1.employee = s1.employee
AND s1.`Project Assignment Date` <= IFNULL(t1.`Project Relieving Date`,CURDATE())
AND NOT EXISTS( SELECT * FROM `Project Assignment` t2
WHERE t2.employee = s1.employee
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) >= t2.`Project Assignment Date`
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) < IFNULL(t2.`Project Relieving Date`,CURDATE()))
WHERE NOT EXISTS( SELECT * FROM `Project Assignment` s2
WHERE s2.employee = s1.employee
AND s1.`Project Assignment Date` > s2.`Project Assignment Date`
AND s1.`Project Assignment Date` <= IFNULL(s2.`Project Relieving Date`,CURDATE()))
GROUP BY s1.employee, s1.`Project Assignment Date`
ORDER BY s1.`Project Assignment Date`) merged
GROUP BY merged.employee
结果:
| employee | assigned_days |
| -------- | ------------- |
| A | 936 |
但想不出一种方法来得出此人在所有分配的项目中休息的天数。
预期结果:
+----------+---------------+------------+-------------+
| employee | assigned_days | break_days | worked_days |
+==========+===============+============+=============+
| A | 936 | 50 | 886 |
+----------+---------------+------------+-------------+
Mariadb 10.3.29
锻炼说明break_days
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| Employee | Project | Break Start | Break End | Days Considered | Remarks |
+==========+=========+=============+==================+=================+===================================================================================================================+
| A | 1 | 2018-09-01 | 2018-09-30 | 30 | Only one project assigned so consider whole break |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| A | 1 | 2019-10-05 | 2019-11-30 | 10 | 3 Projects were assigned during these breaks. The common days of break fall between 2019-11-01 and 2019-11-10 |
+----------+---------+-------------+------------------+ | |
| A | 2 | 2019-10-15 | 2019-11-15 | | |
+----------+---------+-------------+------------------+ | |
| A | 3 | 2019-11-01 | 2019-11-10 | | |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| A | 2 | 2020-01-01 | 2020-01-10 | 10 | 2 Projects were assigned during this time and break in both projects |
+----------+---------+-------------+------------------+ | |
| A | 3 | 2020-01-01 | 2020-01-10 | | |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| | | | Total Break Days | 50 | |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
Link 对于 DB-Fiddle:https://www.db-fiddle.com/f/c8fMneAUkhb2P3rzjMtVZm/0
使用递归 CTE 获取每个员工的所有工作和所有休息日期。
然后,对于这两种情况下的每个日期,使用聚合将所有项目作为逗号分隔列表 GROUP_CONCAT()
.
如果这些列表与特定日期匹配,那么这是一个休息日期。
WITH RECURSIVE
working_dates AS (
SELECT `Employee`, `Project Id`, `Project Assignment Date` AS date, `Project Relieving Date`
FROM `Project Assignment`
UNION ALL
SELECT `Employee`, `Project Id`, date + INTERVAL 1 day, `Project Relieving Date`
FROM working_dates
WHERE date < `Project Relieving Date`
),
break_dates AS (
SELECT `Employee`, `Project Id`, `Break Start Date` AS date, `Break End Date`
FROM `Break`
UNION ALL
SELECT `Employee`, `Project Id`, date + INTERVAL 1 day, `Break End Date`
FROM break_dates
WHERE date < `Break End Date`
),
working AS (
SELECT `Employee`, date,
GROUP_CONCAT(`Project Id` ORDER BY `Project Id`) projects
FROM working_dates
GROUP BY `Employee`, date
),
breaks AS (
SELECT `Employee`, date,
GROUP_CONCAT(`Project Id` ORDER BY `Project Id`) projects
FROM break_dates
GROUP BY `Employee`, date
)
SELECT w.`Employee`,
COUNT(*) assigned_days,
COUNT(b.date) AS break_days,
COUNT(*) - COUNT(b.date) worked_days
FROM working w LEFT JOIN breaks b
ON w.`Employee` = b.`Employee` AND w.date = b.date AND w.projects = b.projects
GROUP BY w.`Employee`
参见demo。
将 Break Id
列添加到 Break
table 后,我可以根据 @forpass 建议的聚合技术来推导休息日:
Then, for each date in both cases, with aggregation get all the projects as a comma separated list with GROUP_CONCAT().
对于每个中断,获取重叠项目的计数和列表(使用 GROUP_CONCAT)。
然后在 Break
上再次加入它以查找重叠中断的计数和列表以及最小的公共重叠(最晚开始和最早结束)。使用 ROW_NUMBER
消除重复项。
将 Assigned Days 的查询移动到另一个 CTE 并与 CTE 合并以获取休息时间 想要的结果。
WITH breaks_summary AS (
SELECT `Employee`, SUM(break_days) break_days
FROM (
SELECT b.`Employee`, DATEDIFF(b.end_date, b.start_date)+1 break_days, ROW_NUMBER() OVER (PARTITION BY b.break_ids) rn, overlapping_breaks, break_ids, projects_count
FROM (
SELECT b_p_cnt.`Employee`, b_p_cnt.`Project Id`, b_p_cnt.projects_count,
COUNT(b2.`Break Id`) overlapping_breaks, GROUP_CONCAT(b2.`Break Id`) break_ids, MAX(b2.start_date) start_date, MIN(b2.end_date) end_date
FROM (
SELECT b1.`Break Id`, b1.`Employee`, b1.`Project Id`, b1.start_date, b1.end_date, GROUP_CONCAT(pa.`Project Id`) projects, count(pa.`Project Id`) projects_count
FROM (
SELECT `Break Id`, `Employee`, `Project Id`, `Break Start Date` AS start_date, `Break End Date` AS end_date
FROM `Break`
) b1
LEFT JOIN `Project Assignment` pa ON b1.`Employee` = pa.`Employee`
AND ((b1.start_date BETWEEN pa.`Project Assignment Date` AND IFNULL(pa.`Project Relieving Date`,CURDATE()))
OR (b1.end_date BETWEEN pa.`Project Assignment Date` AND IFNULL(pa.`Project Relieving Date`,CURDATE())))
GROUP BY b1.`Break Id`, b1.`Employee`, b1.`Project Id`, b1.start_date, b1.end_date) b_p_cnt
LEFT JOIN (
SELECT `Break Id`, `Employee`, `Project Id`, `Break Start Date` AS start_date, `Break End Date` AS end_date
FROM `Break`
ORDER BY `Break Id`) b2 ON b_p_cnt.`Employee` = b2.`Employee`
AND ((b_p_cnt.start_date BETWEEN b2.start_date AND b2.end_date)
OR (b_p_cnt.end_date BETWEEN b2.start_date AND b2.end_date))
GROUP BY b_p_cnt.`Break Id`, b_p_cnt.`Employee`, b_p_cnt.`Project Id`,
b_p_cnt.start_date, b_p_cnt.end_date, b_p_cnt.projects, b_p_cnt.projects_count
HAVING count(b2.`Break Id`) = b_p_cnt.projects_count
ORDER BY b_p_cnt.`Employee`, `Project Id`) b
) breaks
WHERE rn = 1
GROUP BY `Employee`),
assigned AS (
SELECT merged.`Employee`, SUM(DATEDIFF(merged.EndDate,merged.`Project Assignment Date`)+1) assigned_days
FROM (SELECT s1.`Employee`, s1.`Project Assignment Date`,
MIN(IFNULL(t1.`Project Relieving Date`,CURDATE())) AS EndDate
FROM `Project Assignment` s1
INNER JOIN `Project Assignment` t1 ON t1.`Employee` = s1.`Employee`
AND s1.`Project Assignment Date` <= IFNULL(t1.`Project Relieving Date`,CURDATE())
AND NOT EXISTS( SELECT * FROM `Project Assignment` t2
WHERE t2.`Employee` = s1.`Employee`
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) >= t2.`Project Assignment Date`
AND IFNULL(t1.`Project Relieving Date`,CURDATE()) < IFNULL(t2.`Project Relieving Date`,CURDATE()))
WHERE NOT EXISTS( SELECT * FROM `Project Assignment` s2
WHERE s2.`Employee` = s1.`Employee`
AND s1.`Project Assignment Date` > s2.`Project Assignment Date`
AND s1.`Project Assignment Date` <= IFNULL(s2.`Project Relieving Date`,CURDATE()))
GROUP BY s1.`Employee`, s1.`Project Assignment Date`
ORDER BY s1.`Project Assignment Date`) merged
GROUP BY merged.`Employee`)
SELECT ad.`Employee`,
ad.assigned_days,
IFNULL(bs.break_days,0) break_days,
(ad.assigned_days - IFNULL(bs.break_days,0)) worked_days
FROM assigned ad
LEFT JOIN breaks_summary bs ON ad.`Employee` = bs.`Employee`
已更新 DB-Fiddle 查询:https://www.db-fiddle.com/f/c8fMneAUkhb2P3rzjMtVZm/3
感谢所有通过改进问题和提供可能答案做出贡献的人。