Improve a slow query when the number of records doubles

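I have a query that selects data from a table.

The number of records in this table varies depending on user input.

When the table holds around 60,000 records or fewer, the query is very fast (less than 2 minutes).

But when I double that to about 120,000 records, it takes more than an hour, and then I have to kill the process.

I don't know why it becomes so much slower.

I have added a lot of indexes, but it is still too slow.

Here is the query: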

DECLARE @MaxID AS bigint

SELECT @MaxID = MAX(iID) FROM TagsTemp
    
SELECT 
    RepID, tag, xmiid, ibegin, iend, 
    confidence, polarity, uncertainty, conditional, generic, historyOf,
    codingScheme, code, cui, /*U.tui,*/ preferredText , --ISNULL(tag2, tag3) AS tagValue, 
    ISNULL(ibegin2, ibegin3) AS ibeginValue, ISNULL(iend2, iend3) AS iendValue,
    dbo.RepGetValue(RepID, ibegin, iend, 
    ISNULL(ibegin2, ibegin3), ISNULL(iend2, iend3))
FROM 
    (SELECT DISTINCT 
         T.RepID, dbo.ShortTag(T.tag) AS tag, T.xmiid, 
         CAST(T.ibegin AS bigint) AS ibegin, CAST(T.iend AS bigint) AS iend, 
         T.confidence, T.polarity, T.uncertainty, T.conditional, T.generic, T.historyOf,
         dbo.ShortTag(U.codingScheme) AS codingScheme, U.code, U.cui, /*U.tui, */U.preferredText,
         dbo.ShortTag(L3.tag) AS tag2, L3.ibegin AS ibegin2, L3.iend AS iend2,
         dbo.ShortTag(M1.tag) AS tag3, M1.ibegin AS ibegin3, M1.iend AS iend3, 
         ROW_NUMBER() OVER(PARTITION BY T.RepID, T.tag, T.xmiid,
         CAST(T.ibegin AS bigint), CAST(T.iend AS bigint), 
         T.confidence, T.polarity, T.uncertainty, T.conditional, T.generic, T.historyOf,
         U.codingScheme, U.code, U.cui, /*U.tui,*/ U.preferredText , L3.tag, L3.ibegin, L3.iend 
     ORDER BY 
         T.RepID, T.xmiid, CAST(T.ibegin AS bigint), CAST(T.iend AS bigint), 
         CAST(M1.ibegin AS bigint), CAST(M1.iend AS bigint) DESC, 
         CASE M1.tag 
            WHEN 'textsem:Mandy' THEN 1 
            WHEN 'textsem:Franc' THEN 2 
            WHEN 'textsem:Roger' THEN 3 
            WHEN 'syntax:Numan' THEN 4 
            WHEN 'textsem:Danna' THEN 5 
            WHEN 'textsem:Rami' THEN 6 
         END) AS RowNo
     FROM 
         TagsTemp T 
     INNER JOIN 
         TagsTemp U ON T.RepID = U.RepID 
                    AND T.Tag IN ('textsem:Michael', 'textsem:Simon', 'textsem:Anna','textsem:Evan','textsem:Paul','textsem:Dines','textsem:Larry')
                    AND U.Tag = 'refsem:Usman'
                    AND T.ontologyConceptArr LIKE '%' + CAST(U.xmiid AS varchar(100)) + '%'
     LEFT OUTER JOIN 
         TagsTemp L1 ON T.tag = 'textsem:Larry' 
                     AND L1.tag = 'relation:ResultOfTextRelation'
                     AND T.RepID = L1.RepID 
                     AND T.LabValue = L1.xmiid 
                     AND ISNULL(L1.arg2, '') <> ''
     LEFT OUTER JOIN 
         TagsTemp L2 ON L1.tag = 'relation:ResultOfTextRelation' 
                     AND L2.tag = 'relation:RelationArgument' 
                     AND L1.RepID = L2.RepID 
                     AND L1.arg2 = L2.xmiid 
     LEFT OUTER JOIN 
         TagsTemp L3 ON L2.tag = 'relation:RelationArgument' 
                     AND  L3.tag IN ('syntax:Numan','textsem:Danna','textsem:Roger','textsem:Mandy', 'textsem:Franc','textsem:Rami')
                     AND L2.RepID = L3.RepID 
                     AND L2.argument = L3.xmiid
     LEFT OUTER JOIN 
         TagsTemp M1 ON T.RepID = M1.RepID 
                     AND T.tag IN ('textsem:Michael', 'textsem:Larry') 
                     AND M1.tag IN ('syntax:Numan','textsem:Danna','textsem:Roger', 'textsem:Mandy',/*'textsem:Rami', */ 'textsem:Franc')
                     AND CAST(M1.ibegin AS bigint) > CAST(T.iend AS bigint) 
                     AND CAST(M1.ibegin AS bigint) - CAST(T.iend AS bigint) < 4 
     WHERE 
         T.iID <= @MaxID) X
WHERE
    RowNo =    1
ORDER BY 
    RepID, tag, xmiid, CAST(ibegin AS bigint) , CAST(iend AS bigint) , 
    confidence, polarity, uncertainty, conditional, generic, historyOf,
    codingScheme, code, cui, /*U.tui,*/ preferredText , tag2, ibegin2, iend2,
    tag3,ibegin3, iend3

Here is the execution plan when I run it with 60,000 records:

https://www.brentozar.com/pastetheplan/?id=r1XHRQXpD

You have a key lookup that is executed 37K times. Try dropping the index "[NonClusteredIndex-20201223-150141]" and creating another index instead:

CREATE INDEX NCI_TagsTemp_1 ON TagsTemp (iID, RepID, tag)
INCLUDE (xmiid, ibegin, iend, ontologyConceptArr, confidence, polarity, uncertainty, conditional, generic, historyOf, labValue)
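The drop step and a quick check of the result could look roughly like this. It is only a sketch: it assumes TagsTemp lives in the dbo schema and that the server is SQL Server 2016 or later (needed for DROP INDEX IF EXISTS). The catalog query just lists which columns each index on the table covers, so you can confirm the new index is there and see which of the many indexes you added overlap.

```sql
-- Assumption: the table is dbo.TagsTemp; adjust the schema name if it differs.
-- DROP INDEX IF EXISTS requires SQL Server 2016 or later.
DROP INDEX IF EXISTS [NonClusteredIndex-20201223-150141] ON dbo.TagsTemp;

-- After running the CREATE INDEX statement above, list every index on the
-- table with its key columns and included columns.
SELECT
    i.name               AS index_name,
    i.type_desc          AS index_type,
    c.name               AS column_name,
    ic.key_ordinal,
    ic.is_included_column
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id
   AND ic.index_id  = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id
   AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.TagsTemp')
ORDER BY i.name, ic.is_included_column, ic.key_ordinal;
```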

I don't have an exact answer, but I can help you improve the performance.

Run these two statements before executing the query:

SET STATISTICS IO ON 
SET STATISTICS TIME ON 

Now run your query and analyze the results in the Messages tab.
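Put together, a measuring session might look like the sketch below. The COUNT(*) is only a stand-in; replace it with the query you are investigating. For each table touched, the Messages tab prints a line with the scan count, logical reads, physical reads and read-ahead reads, followed by the CPU time and elapsed time.

```sql
-- Turn on per-statement I/O and timing output for this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Stand-in statement: replace with the query under investigation.
SELECT COUNT(*) FROM TagsTemp;

-- Turn the output off again so later statements are not cluttered.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Both settings apply only to the current session, so they do not affect other users of the server.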

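Use this site to make the output more readable: http://statisticsparser.com/

Try to find where the physical reads are high. This article explains physical and logical reads: https://vaishaligoilkar3322.medium.com/physical-and-logical-reads-in-sql-server-c6d62e65e359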
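If the Messages output is hard to compare across runs, one alternative (my addition, not part of the advice above) is to ask the plan cache which statements have done the most physical reads. This is a minimal sketch that assumes you have VIEW SERVER STATE permission. As far as I know, the view only reflects statements that ran to completion, so a query you killed after an hour may not show up; use it for the 60,000-row run.

```sql
-- Top cached statements by physical reads since their plans were cached.
-- total_elapsed_time is reported in microseconds, hence the division.
SELECT TOP (10)
    qs.total_physical_reads,
    qs.total_logical_reads,
    qs.total_elapsed_time / 1000 AS total_elapsed_ms,
    qs.execution_count,
    st.text AS batch_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_physical_reads DESC;
```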

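Now you need to find the exact block that takes the time and try to break it down.

Try to reduce the physical and logical reads, and repeat this step until you find the root cause.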
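One way to break the query down is to stage the driving rows once and then time each join on its own. The sketch below does that for the M1 proximity join only. The temp table #T and the index name IX_T are hypothetical names introduced for illustration, and the choice of M1 as the first join to test is a guess, not something the execution plan has confirmed.

```sql
DECLARE @MaxID AS bigint;
SELECT @MaxID = MAX(iID) FROM TagsTemp;

-- Stage only the rows that can play the role of "T" in the original query,
-- casting ibegin/iend to bigint once instead of in every join predicate.
SELECT iID, RepID, tag, xmiid,
       CAST(ibegin AS bigint) AS ibegin,
       CAST(iend   AS bigint) AS iend
INTO #T   -- hypothetical temp table
FROM TagsTemp
WHERE iID <= @MaxID
  AND tag IN ('textsem:Michael', 'textsem:Simon', 'textsem:Anna', 'textsem:Evan',
              'textsem:Paul', 'textsem:Dines', 'textsem:Larry');

CREATE CLUSTERED INDEX IX_T ON #T (RepID, iend);   -- hypothetical index name

SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Time the M1 join in isolation and see how many rows it produces.
SELECT COUNT(*) AS joined_rows
FROM #T AS T
LEFT OUTER JOIN TagsTemp AS M1
    ON  T.RepID = M1.RepID
    AND T.tag IN ('textsem:Michael', 'textsem:Larry')
    AND M1.tag IN ('syntax:Numan', 'textsem:Danna', 'textsem:Roger',
                   'textsem:Mandy', 'textsem:Franc')
    AND CAST(M1.ibegin AS bigint) > T.iend
    AND CAST(M1.ibegin AS bigint) - T.iend < 4;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;

DROP TABLE #T;
```

If joined_rows is far larger than the row count of #T, this join is multiplying rows before DISTINCT and ROW_NUMBER get to work, which would be consistent with run time growing much faster than the table. Repeating the same test for the U join (the LIKE predicate) and the L1 to L3 chain should show which block dominates.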

This is the best way I know to improve query performance.

Hope this helps.