How do you solve this SQL query?
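In this puzzle, we have to group the data by each patient's admission and discharge dates. If a patient's discharge date + 1 day equals their next admission date, the two rows are merged into one row and their costs are summed. Please see the sample input and expected output for details.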
Sample Input
PatientID AdmissionDate DischargeDate Cost
1009 27-07-2014 31-07-2014 1050
1009 01-08-2014 23-08-2014 1070
1009 31-08-2014 31-08-2014 1900
1009 01-09-2014 14-09-2014 1260
1009 01-12-2014 31-12-2014 2090
1024 07-06-2014 28-06-2014 1900
1024 29-06-2014 31-07-2014 2900
1024 01-08-2014 02-08-2014 1800
Expected Output
PatientId AdmissionDate DischargeDate Cost
1009 27-07-2014 23-08-2014 2120
1009 31-08-2014 14-09-2014 3160
1009 01-12-2014 31-12-2014 2090
1024 07-06-2014 02-08-2014 6600
I can't work out a solution.
Use this to create the table:
CREATE TABLE PatientProblem
(
PatientID integer,
AdmissionDate date,
DischargeDate date,
Cost numeric(20,2)
);
--Insert Data
INSERT INTO PatientProblem(PatientID,AdmissionDate,DischargeDate
,Cost)
VALUES
(1009,'2014-07-27','2014-07-31',1050.00),
(1009,'2014-08-01','2014-08-23',1070.00),
(1009,'2014-08-31','2014-08-31',1900.00),
(1009,'2014-09-01','2014-09-14',1260.00),
(1009,'2014-12-01','2014-12-31',2090.00),
(1024,'2014-06-07','2014-06-28',1900.00),
(1024,'2014-06-29','2014-07-31',2900.00),
(1024,'2014-08-01','2014-08-02',1800.00);
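This is called a gaps-and-islands problem. It is usually solved with window functions. With LAG you can look at the value from the previous row. That lets us mark the rows where a patient starts a new date range: a row gets a marker when its admission date is more than one day after the previous discharge date. Then, by building a running count of those markers, we get a group number for each of the patient's date ranges.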
select
    patientid,
    min(admissiondate) as range_start,
    max(dischargedate) as range_end,
    sum(cost) as total_cost
from
(
    select
        patientid, admissiondate, dischargedate, cost,
        -- running count of markers = group number of the date range
        count(marker) over (partition by patientid order by admissiondate) as grp
    from
    (
        select
            patientid, admissiondate, dischargedate, cost,
            -- mark rows admitted more than one day after the previous discharge
            case when admissiondate > lag(dischargedate) over (partition by patientid order by admissiondate) + interval '1 day' then
                1
            end as marker
        from PatientProblem
    ) marked
) grouped
group by patientid, grp
order by patientid, grp;
Demo: https://dbfiddle.uk/?rdbms=postgres_14&fiddle=d2ee8fede9bb999ada79310047e5ae27
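If it helps to see the two steps (marker, then running count) outside SQL, here is a minimal Python sketch of the same logic run on the sample data. It is only an illustration, not part of the query: the names SAMPLE_ROWS and merge_stays are made up for this example, and costs are plain integers for brevity.

```python
from datetime import date, timedelta

# Sample rows from the question: (patient_id, admission, discharge, cost).
SAMPLE_ROWS = [
    (1009, date(2014, 7, 27), date(2014, 7, 31), 1050),
    (1009, date(2014, 8, 1), date(2014, 8, 23), 1070),
    (1009, date(2014, 8, 31), date(2014, 8, 31), 1900),
    (1009, date(2014, 9, 1), date(2014, 9, 14), 1260),
    (1009, date(2014, 12, 1), date(2014, 12, 31), 2090),
    (1024, date(2014, 6, 7), date(2014, 6, 28), 1900),
    (1024, date(2014, 6, 29), date(2014, 7, 31), 2900),
    (1024, date(2014, 8, 1), date(2014, 8, 2), 1800),
]


def merge_stays(rows):
    """Hypothetical helper: merge stays that follow the previous discharge by at most one day."""
    # Same ordering as "partition by patientid order by admissiondate".
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    islands = {}  # (patient, grp) -> [range_start, range_end, total_cost]
    prev_patient = None
    prev_discharged = None
    grp = 0
    for patient, admitted, discharged, cost in ordered:
        if patient != prev_patient:
            grp = 0  # new partition: the running count starts over
        elif admitted > prev_discharged + timedelta(days=1):
            grp += 1  # marker row: a new date range starts here
        key = (patient, grp)
        if key not in islands:
            islands[key] = [admitted, discharged, cost]
        else:
            island = islands[key]
            island[0] = min(island[0], admitted)
            island[1] = max(island[1], discharged)
            island[2] += cost
        prev_patient, prev_discharged = patient, discharged
    # Dicts keep insertion order, so the result is ordered by patient, then group.
    return [(p, start, end, total) for (p, _), (start, end, total) in islands.items()]


for patient, start, end, total in merge_stays(SAMPLE_ROWS):
    print(patient, start.isoformat(), end.isoformat(), total)
```

Like the query, the sketch compares each row only with the row directly before it, so it assumes a patient's stays do not overlap or nest.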