Why are these two SQL queries so different in efficiency?

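I have to use SQL for my internship. I know the basics, but I don't have a real programming background, so I don't know what makes code efficient and so on.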

Query #1

SELECT DISTINCT 
    c.[STAT], c.[EVENT], f.[STAT], f.[EVENT]
FROM
    (SELECT *
     FROM 
         (SELECT 
              *, 
              ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [PROCDT], [PROCTIME]) AS a
          FROM 
              TABLE) AS b
    ) AS c
LEFT JOIN 
    (SELECT 
         *
     FROM 
         (SELECT 
              *, 
              ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [PROCDT], [PROCTIME]) AS d
          FROM 
              TABLE) AS e
         ) AS f ON c.[ID] = f.[ID] AND a = d - 1
ORDER BY 
    c.[STAT], c.[EVENT], f.[STAT], f.[EVENT]

Query #2

SELECT DISTINCT 
    b.[STAT], b.[EVENT], d.[STAT], d.[EVENT]
FROM
    (SELECT 
         *, 
         ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [PROCDT], [PROCTIME]) AS a
     FROM TABLE) AS b
LEFT JOIN 
    (SELECT 
         *, 
         ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [PROCDT], [PROCTIME]) AS c
     FROM TABLE) AS d ON b.[ID] = d.[ID] AND a = c - 1
ORDER BY 
    b.[STAT], b.[EVENT], d.[STAT], d.[EVENT]

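Queries #1 and #2 return the same results, which is expected, but query #1 runs in about 5 seconds while query #2 takes about 1 minute 35 seconds. In other words, the second query takes roughly 1.5 minutes longer than the first, and I would really like to know why.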

The right way to write this query is with lead(). I'm pretty sure the select distinct isn't needed, so this is what you want:

SELECT stat, event,
       LEAD(stat) OVER (PARTITION BY ID ORDER BY PROCDT, PROCTIME) as next_stat,
       LEAD(event) OVER (PARTITION BY ID ORDER BY PROCDT, PROCTIME) as next_event
FROM TABLE t
ORDER BY stat, event;
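
To see what LEAD() returns here, the following is a minimal sketch on made-up data. The temp table #demo, its column types, and the sample rows are all assumptions for illustration, not the asker's real table; it needs SQL Server 2012 or later, where LEAD() is available.

```sql
-- Hypothetical sample data; #demo stands in for the real table.
CREATE TABLE #demo (
    ID       INT,
    PROCDT   DATE,
    PROCTIME TIME,
    STAT     VARCHAR(10),
    EVENT    VARCHAR(10)
);

INSERT INTO #demo (ID, PROCDT, PROCTIME, STAT, EVENT) VALUES
    (1, '2020-01-01', '08:00', 'A', 'open'),
    (1, '2020-01-01', '09:00', 'B', 'review'),
    (1, '2020-01-02', '08:00', 'C', 'close'),
    (2, '2020-01-01', '10:00', 'A', 'open');

-- For each row, LEAD() reads the next row within the same ID,
-- ordered by PROCDT and then PROCTIME. The last row of each ID
-- has no successor, so next_stat and next_event are NULL there.
SELECT ID, STAT, EVENT,
       LEAD(STAT)  OVER (PARTITION BY ID ORDER BY PROCDT, PROCTIME) AS next_stat,
       LEAD(EVENT) OVER (PARTITION BY ID ORDER BY PROCDT, PROCTIME) AS next_event
FROM #demo
ORDER BY ID, PROCDT, PROCTIME;

DROP TABLE #demo;
```

The NULLs on the last row of each ID correspond to the unmatched rows that the LEFT JOIN on a = d - 1 produces in the original queries, but the table is read and numbered only once instead of twice.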

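The two queries you wrote should be equivalent in SQL Server. Apparently, the extra subquery confuses the optimizer. You would need to look at the execution plans to understand this better.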
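
One way to compare the two queries, sketched under the assumption that you run them from SQL Server Management Studio or a similar client that shows messages and plan output: switch on the session statistics below, run each query in turn, and compare the reported times, logical reads, and actual plans.

```sql
-- Report CPU and elapsed time for each statement in this session.
SET STATISTICS TIME ON;
-- Report logical and physical reads per table.
SET STATISTICS IO ON;
-- Return the actual execution plan as XML after each statement runs.
SET STATISTICS XML ON;

-- Run query #1 here, then query #2, and compare the output of each.

SET STATISTICS XML OFF;
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Differences in join operators, sort steps, or estimated versus actual row counts between the two plans are the usual places to look for where the extra time goes.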