NOT IN statement is slowing down my query

Question

I have a problem with my query. Here is a simple example that illustrates my code.

SELECT distinct ID 
FROM Table  
WHERE IteamNumber in (132,434,675) AND Year(DateCreated) = 2019
      AND ID NOT IN (
                     SELECT Distinct ID FROM Table  
                     WHERE IteamNumber in (132,434,675) AND DateCreated < '2019-01-01')

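As you can see, I am retrieving the unique IDs of data that was created in 2019 and not earlier.

The SELECT statement works fine, but as soon as I add the NOT IN clause, the query can easily take more than a minute.

My other question is whether this could be related to computer/server performance, i.e. the SQL Server that runs Microsoft Business Central? The same query, even with the NOT IN clause, runs perfectly well, but that is on the Microsoft Dynamics C5 SQL Server.

So my question is: is something wrong with my query, or is this mainly a server problem?

Update: here is a real example. It takes 25 seconds to retrieve 500 rows: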

Select count(distinct b.No_),'2014'
from [Line] c    
inner join [Header] a
on a.CollectionNo = c.CollectionNo
Inner join [Customer] b
on b.No_ = a.CustomerNo

where  c.No_ in('2101','2102','2103','2104','2105')
and year(Enrollmentdate)= 2014 
and(a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate >= '2014-12-31')


and NOT EXISTS(Select distinct x.No_
                 from [Line] c    
                 inner join [Header] a
                 on a.CollectionNo = c.CollectionNo
                 Inner join [Customer] x
                 on x.No_ = a.CustomerNo
                 where x.No_ = b.No_ and 
                       c.No_ in('2101','2102','2103','2104','2105')
                       and Enrollmentdate < '2014-01-01'
                       and(a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate > '2014-12-31'))

Answer 1

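The problem is caused by your IN clause. I think it is best to avoid IN here; instead, build a join against the subquery and filter your data with a WHERE clause.

With an IN subquery, each row of your table may end up being compared against the rows returned by the subquery, which can slow the query down considerably.

If you must use IN, use it together with an index. Create appropriate indexes on the relevant columns to improve performance.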
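As a rough sketch of the indexing advice above, the statements below show one possible index for the question's simplified query. The index name is hypothetical, and the column order is an assumption that should be checked against the actual execution plan. The sketch also replaces Year(DateCreated) = 2019 with a half-open date range: wrapping the column in a function generally prevents SQL Server from seeking on an index over DateCreated, while a plain range comparison does not.

```sql
-- Hypothetical index name; [Table] and its columns come from the
-- question's simplified example, not from a real schema.
CREATE NONCLUSTERED INDEX IX_Table_IteamNumber_DateCreated
    ON [Table] (IteamNumber, DateCreated)
    INCLUDE (ID);

-- Same filter as the original query, but the year test is written as a
-- half-open range so the predicate on DateCreated stays index-friendly.
SELECT DISTINCT T1.ID
FROM [Table] AS T1
WHERE T1.IteamNumber IN (132, 434, 675)
  AND T1.DateCreated >= '20190101'
  AND T1.DateCreated <  '20200101'
  AND T1.ID NOT IN (
        SELECT T2.ID
        FROM [Table] AS T2
        WHERE T2.IteamNumber IN (132, 434, 675)
          AND T2.DateCreated < '20190101');
```

Whether this index actually helps depends on the data distribution and on the indexes that already exist, so it is worth comparing the execution plans before and after.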

You can use EXISTS instead of IN to improve the performance of the query.

An example using EXISTS:

SELECT distinct T1.ID 
FROM Table AS T1 
WHERE T1.IteamNumber in (132,434,675) AND Year(T1.DateCreated) = 2019
      AND NOT EXISTS (
                     SELECT 1 FROM Table AS T2 
                     WHERE T1.ID=T2.ID 
                     AND T2.IteamNumber in (132,434,675) AND T2.DateCreated < '2019-01-01' )
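Apart from performance, NOT IN and NOT EXISTS also differ in how they treat NULL, which matters when swapping one for the other. The following is a minimal, self-contained sketch using two made-up table variables (@a and @b are illustrative only and are not part of the question's schema):

```sql
-- Illustrative table variables; not part of the original schema.
DECLARE @a TABLE (ID int NULL);
DECLARE @b TABLE (ID int NULL);

INSERT INTO @a (ID) VALUES (1), (2), (3);
INSERT INTO @b (ID) VALUES (1), (NULL);

-- NOT IN: returns no rows. For ID 2 and 3 the comparison with NULL
-- evaluates to UNKNOWN, so the predicate is never TRUE.
SELECT a.ID
FROM @a AS a
WHERE a.ID NOT IN (SELECT b.ID FROM @b AS b);

-- NOT EXISTS: returns 2 and 3. The NULL row in @b simply never
-- matches the correlation b.ID = a.ID.
SELECT a.ID
FROM @a AS a
WHERE NOT EXISTS (SELECT 1 FROM @b AS b WHERE b.ID = a.ID);
```

If the ID column is declared NOT NULL, the two forms return the same rows, and the choice between them is mainly a matter of which plan the optimizer produces.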

Answer 2

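I usually prefer JOINs to INs. You get the same result, but the engine is often able to optimize a join better.

You join the main query (T1) to what was the IN subquery (T2), then filter on T2.ID being NULL, which ensures that no record matching those conditions was found.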

SELECT distinct T1.ID 
FROM Table T1 
     LEFT JOIN Table T2 on T2.ID = T1.ID AND 
                     T2.IteamNumber in (132,434,675) AND T2.DateCreated < '2019-01-01'
WHERE T1.IteamNumber in (132,434,675) AND Year(T1.DateCreated) = 2019 AND
      T2.ID is null

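Update: here is a proposal adapted to your real query. Because your subquery contains inner joins, I put it in a CTE so that you can left join to it. The logic is the same: you join the main query to the subquery and return only the rows for which no matching record is found in the subquery.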

with previous as (
  Select x.No_
  from [Line] c    
       inner join [Header] a on a.CollectionNo = c.CollectionNo
       inner join [Customer] x on x.No_ = a.CustomerNo
  where     c.No_ in ('2101','2102','2103','2104','2105')
        and Enrollmentdate < '2014-01-01'
        and (a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate > '2014-12-31')
)
Select count(distinct b.No_),'2014'
from [Line] c    
     inner join [Header] a on a.CollectionNo = c.CollectionNo
     inner join [Customer] b on b.No_ = a.CustomerNo
     left join previous p on p.No_ = b.No_
where    c.No_ in ('2101','2102','2103','2104','2105')
     and year(Enrollmentdate)= 2014 
     and (a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate >= '2014-12-31')
     and p.No_ is null

Answer 3

If I understand correctly, you can write the query as a GROUP BY query with a HAVING clause:

SELECT ID 
FROM t
WHERE IteamNumber in (132, 434, 675)
GROUP BY ID
HAVING MIN(DateCreated) >= '20190101' -- no row earlier than 2019
AND    MIN(DateCreated) <  '20200101' -- at least one row less than 2020

This removes the IDs for which an earlier record exists. You can improve performance further by creating a covering index:

CREATE INDEX IX_t_0001 ON t (ID) INCLUDE (IteamNumber, DateCreated)
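To see how the HAVING conditions behave, here is a small worked example on invented sample data. The table variable @t and its rows are assumptions made purely for illustration:

```sql
-- Invented sample data; @t stands in for the real table.
DECLARE @t TABLE (
    ID          int  NOT NULL,
    IteamNumber int  NOT NULL,
    DateCreated date NOT NULL
);

INSERT INTO @t (ID, IteamNumber, DateCreated) VALUES
    (1, 132, '20190310'),  -- only row is in 2019: kept
    (2, 434, '20180505'),  -- has an earlier matching row: dropped
    (2, 434, '20190601'),
    (3, 675, '20200115'),  -- first matching row is in 2020: dropped
    (4, 999, '20180101'),  -- other item number, ignored by the WHERE filter
    (4, 132, '20190701');  -- first matching row is in 2019: kept

SELECT ID
FROM @t
WHERE IteamNumber IN (132, 434, 675)
GROUP BY ID
HAVING MIN(DateCreated) >= '20190101'  -- no matching row before 2019
   AND MIN(DateCreated) <  '20200101'  -- earliest matching row falls in 2019
ORDER BY ID;
-- Returns IDs 1 and 4.
```

Because the earliest matching row per ID decides the outcome, a single pass over the table is enough and no second lookup for earlier rows is needed.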

sql-server

notin

dynamics-business-central