SQL: CTE查询速度
SQL: CTE query Speed
我正在使用 SQL Server 2008 并试图提高下面的查询速度。该查询根据再入院日期为患者分配分数。
示例:患者在 1/2、1/5、1/7、1/8、1/9、2/4 就诊。我想在彼此相隔 3 天内进行首次团体访问。 1/2-5 分组,1/7-9 分组。 1/5 未与 1/7 分组,因为 1/5 的实际访问日期是 1/2。 1/7 将获得 3 分,因为它是 1/2 的重新录取。 2/4 也将获得 3 分,因为它是 1/7 的重新录取。当日期分组时,第一个日期是实际访问日期。
大多数文章建议限制数据集或添加索引以提高速度。我已将行数限制在大约 15,000 行并添加了一个索引。当 运行 使用 45 个测试访问日期/3 个测试患者进行查询时,查询需要 1.5 分钟到 运行。使用我的实际数据集需要 > 8 小时。
我怎样才能使这个查询 运行 < 1 小时?有没有更好的方法来编写我的查询?我的索引看起来正确吗?任何帮助将不胜感激。
查询下方的预期结果示例。
;CREATE TABLE RiskReadmits(MRN INT, VisitDate DATE, Category VARCHAR(15))
;CREATE CLUSTERED INDEX Risk_Readmits_Index ON RiskReadmits(VisitDate)
;INSERT RiskReadmits(MRN,VisitDate,CATEGORY)
VALUES
(1, '1/2/2016','Inpatient'),
(1, '1/5/2016','Inpatient'),
(1, '1/7/2016','Inpatient'),
(1, '1/8/2016','Inpatient'),
(1, '1/9/2016','Inpatient'),
(1, '2/4/2016','Inpatient'),
(1, '6/2/2016','Inpatient'),
(1, '6/3/2016','Inpatient'),
(1, '6/5/2016','Inpatient'),
(1, '6/6/2016','Inpatient'),
(1, '6/8/2016','Inpatient'),
(1, '7/1/2016','Inpatient'),
(1, '8/1/2016','Inpatient'),
(1, '8/4/2016','Inpatient'),
(1, '8/15/2016','Inpatient'),
(1, '8/18/2016','Inpatient'),
(1, '8/28/2016','Inpatient'),
(1, '10/12/2016','Inpatient'),
(1, '10/15/2016','Inpatient'),
(1, '11/17/2016','Inpatient'),
(1, '12/20/2016','Inpatient')
;WITH a AS (
SELECT
z1.VisitDate
, z1.MRN
, (SELECT MIN(VisitDate) FROM RiskReadmits WHERE VisitDate > DATEADD(day, 3, z1.VisitDate)) AS NextDay
FROM
RiskReadmits z1
WHERE
CATEGORY = 'Inpatient'
), a1 AS (
SELECT
MRN
, MIN(VisitDate) AS VisitDate
, MIN(NextDay) AS NextDay
FROM
a
GROUP BY
MRN
), b AS (
SELECT
VisitDate
, MRN
, NextDay
, 1 AS OrderRow
FROM
a1
UNION ALL
SELECT
a.VisitDate
, a.MRN
, a.NextDay
, b.OrderRow +1 AS OrderRow
FROM
a
JOIN b
ON a.VisitDate = b.NextDay
), c AS (
SELECT
MRN,
VisitDate
, (SELECT MAX(VisitDate) FROM b WHERE b1.VisitDate > VisitDate AND b.MRN = b1.MRN) AS PreviousVisitDate
FROM
b b1
)
SELECT distinct
c1.MRN,
c1.VisitDate
, CASE
WHEN DATEDIFF(day,c1.PreviousVisitDate,c1.VisitDate) < 30 THEN PreviousVisitDate
ELSE NULL
END AS ReAdmissionFrom
, CASE
WHEN DATEDIFF(day,c1.PreviousVisitDate,c1.VisitDate) < 30 THEN 3
ELSE 0
END AS Points
FROM
c c1
ORDER BY c1.MRN
预期结果:
MRN VisitDate ReAdmissionFrom Points
1 2016-01-02 NULL 0
1 2016-01-07 2016-01-02 3
1 2016-02-04 2016-01-07 3
1 2016-06-02 NULL 0
1 2016-06-06 2016-06-02 3
1 2016-07-01 2016-06-06 3
1 2016-08-01 NULL 0
1 2016-08-15 2016-08-01 3
1 2016-08-28 2016-08-15 3
1 2016-10-12 NULL 0
1 2016-11-17 NULL 0
1 2016-12-20 NULL 0
我要大胆猜测一下,你想把 b cte 改成
将 AND a.MRN = b.MRN 作为第二个 select 查询中的第二个条件,如下所示:
, b AS (
SELECT
VisitDate
, MRN
, NextDay
, 1 AS OrderRow
FROM
firstVisitAndFollowUp
UNION ALL
SELECT
a.VisitDate
, a.MRN
, a.NextDay
, b.OrderRow +1 AS OrderRow
FROM
visitsDistance3daysOrMore AS a
JOIN b
ON a.VisitDate = b.NextDay AND a.MRN = b.MRN
)
糟糕,我更改了几个 cte 的名称(post 弄乱了代码)
应该是这样的:
b AS (
SELECT
访问日期
, 核磁共振网络
, 明天
, 1 作为订单行
从
a1
UNION ALL
SELECT
a.VisitDate
, a.MRN
, a.NextDay
, b.OrderRow +1 AS OrderRow
FROM
a AS a
JOIN b
ON a.VisitDate = b.NextDay AND a.MRN = b.MRN
)
我正在使用 SQL Server 2008 并试图提高下面的查询速度。该查询根据再入院日期为患者分配分数。
示例:患者在 1/2、1/5、1/7、1/8、1/9、2/4 就诊。我想在彼此相隔 3 天内进行首次团体访问。 1/2-5 分组,1/7-9 分组。 1/5 未与 1/7 分组,因为 1/5 的实际访问日期是 1/2。 1/7 将获得 3 分,因为它是 1/2 的重新录取。 2/4 也将获得 3 分,因为它是 1/7 的重新录取。当日期分组时,第一个日期是实际访问日期。
大多数文章建议限制数据集或添加索引以提高速度。我已将行数限制在大约 15,000 行并添加了一个索引。当 运行 使用 45 个测试访问日期/3 个测试患者进行查询时,查询需要 1.5 分钟到 运行。使用我的实际数据集需要 > 8 小时。 我怎样才能使这个查询 运行 < 1 小时?有没有更好的方法来编写我的查询?我的索引看起来正确吗?任何帮助将不胜感激。
查询下方的预期结果示例。
;CREATE TABLE RiskReadmits(MRN INT, VisitDate DATE, Category VARCHAR(15))
;CREATE CLUSTERED INDEX Risk_Readmits_Index ON RiskReadmits(VisitDate)
;INSERT RiskReadmits(MRN,VisitDate,CATEGORY)
VALUES
(1, '1/2/2016','Inpatient'),
(1, '1/5/2016','Inpatient'),
(1, '1/7/2016','Inpatient'),
(1, '1/8/2016','Inpatient'),
(1, '1/9/2016','Inpatient'),
(1, '2/4/2016','Inpatient'),
(1, '6/2/2016','Inpatient'),
(1, '6/3/2016','Inpatient'),
(1, '6/5/2016','Inpatient'),
(1, '6/6/2016','Inpatient'),
(1, '6/8/2016','Inpatient'),
(1, '7/1/2016','Inpatient'),
(1, '8/1/2016','Inpatient'),
(1, '8/4/2016','Inpatient'),
(1, '8/15/2016','Inpatient'),
(1, '8/18/2016','Inpatient'),
(1, '8/28/2016','Inpatient'),
(1, '10/12/2016','Inpatient'),
(1, '10/15/2016','Inpatient'),
(1, '11/17/2016','Inpatient'),
(1, '12/20/2016','Inpatient')
;WITH a AS (
SELECT
z1.VisitDate
, z1.MRN
, (SELECT MIN(VisitDate) FROM RiskReadmits WHERE VisitDate > DATEADD(day, 3, z1.VisitDate)) AS NextDay
FROM
RiskReadmits z1
WHERE
CATEGORY = 'Inpatient'
), a1 AS (
SELECT
MRN
, MIN(VisitDate) AS VisitDate
, MIN(NextDay) AS NextDay
FROM
a
GROUP BY
MRN
), b AS (
SELECT
VisitDate
, MRN
, NextDay
, 1 AS OrderRow
FROM
a1
UNION ALL
SELECT
a.VisitDate
, a.MRN
, a.NextDay
, b.OrderRow +1 AS OrderRow
FROM
a
JOIN b
ON a.VisitDate = b.NextDay
), c AS (
SELECT
MRN,
VisitDate
, (SELECT MAX(VisitDate) FROM b WHERE b1.VisitDate > VisitDate AND b.MRN = b1.MRN) AS PreviousVisitDate
FROM
b b1
)
SELECT distinct
c1.MRN,
c1.VisitDate
, CASE
WHEN DATEDIFF(day,c1.PreviousVisitDate,c1.VisitDate) < 30 THEN PreviousVisitDate
ELSE NULL
END AS ReAdmissionFrom
, CASE
WHEN DATEDIFF(day,c1.PreviousVisitDate,c1.VisitDate) < 30 THEN 3
ELSE 0
END AS Points
FROM
c c1
ORDER BY c1.MRN
预期结果:
MRN VisitDate ReAdmissionFrom Points
1 2016-01-02 NULL 0
1 2016-01-07 2016-01-02 3
1 2016-02-04 2016-01-07 3
1 2016-06-02 NULL 0
1 2016-06-06 2016-06-02 3
1 2016-07-01 2016-06-06 3
1 2016-08-01 NULL 0
1 2016-08-15 2016-08-01 3
1 2016-08-28 2016-08-15 3
1 2016-10-12 NULL 0
1 2016-11-17 NULL 0
1 2016-12-20 NULL 0
我要大胆猜测一下,你想把 b cte 改成 将 AND a.MRN = b.MRN 作为第二个 select 查询中的第二个条件,如下所示:
, b AS (
SELECT
VisitDate
, MRN
, NextDay
, 1 AS OrderRow
FROM
firstVisitAndFollowUp
UNION ALL
SELECT
a.VisitDate
, a.MRN
, a.NextDay
, b.OrderRow +1 AS OrderRow
FROM
visitsDistance3daysOrMore AS a
JOIN b
ON a.VisitDate = b.NextDay AND a.MRN = b.MRN
)
糟糕,我更改了几个 cte 的名称(post 弄乱了代码)
应该是这样的:
b AS ( SELECT 访问日期 , 核磁共振网络 , 明天 , 1 作为订单行 从 a1
UNION ALL
SELECT
a.VisitDate
, a.MRN
, a.NextDay
, b.OrderRow +1 AS OrderRow
FROM
a AS a
JOIN b
ON a.VisitDate = b.NextDay AND a.MRN = b.MRN
)