NOT IN statement is slowing down my query
I have a problem with my query. Here is a simple example that illustrates my code.
SELECT distinct ID
FROM Table
WHERE IteamNumber in (132,434,675) AND Year(DateCreated) = 2019
AND ID NOT IN (
SELECT Distinct ID FROM Table
WHERE IteamNumber in (132,434,675) AND DateCreated < '2019-01-01')
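As you can see, I am retrieving the unique IDs whose data was created in 2019 and not earlier.
The select statement works fine, but as soon as I add the NOT IN clause, the query can easily take more than a minute.
My other question is whether this could be related to computer/server performance, i.e. the SQL Server that runs Microsoft Business Central. The same query, even with the NOT IN clause, runs perfectly well, but that was on the Microsoft Dynamics C5 SQL Server.
So my question is: is something wrong with my query, or is it mainly a server issue?
UPDATE: here is a real example. It takes 25 seconds to retrieve 500 rows.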
Select count(distinct b.No_),'2014'
from [Line] c
inner join [Header] a
on a.CollectionNo = c.CollectionNo
Inner join [Customer] b
on b.No_ = a.CustomerNo
where c.No_ in('2101','2102','2103','2104','2105')
and year(Enrollmentdate)= 2014
and(a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate >= '2014-12-31')
and NOT EXISTS(Select distinct x.No_
from [Line] c
inner join [Header] a
on a.CollectionNo = c.CollectionNo
Inner join [Customer] x
on x.No_ = a.CustomerNo
where x.No_ = b.No_ and
c.No_ in('2101','2102','2103','2104','2105')
and Enrollmentdate < '2014-01-01'
and(a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate > '2014-12-31'))
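The problem is caused by your IN statement. I think it is better to avoid IN here; instead, create a join with the subquery and filter out your data with a where clause.
With an IN statement, each record of your table can end up being compared against all records of the subquery, which will slow down your query.
If you have to use an IN clause, use it together with an index. Create a proper index on the respective columns to improve performance.
You can use EXISTS instead of IN to improve the performance of your query.
An example using EXISTS: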
SELECT distinct T1.ID
FROM Table AS T1
WHERE T1.IteamNumber in (132,434,675) AND Year(T1.DateCreated) = 2019
AND NOT EXISTS (
SELECT 1 FROM Table AS T2
WHERE T1.ID=T2.ID
AND T2.IteamNumber in (132,434,675) AND T2.DateCreated < '2019-01-01' )
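I usually prefer JOINs over INs. You get the same result, but the engine tends to be able to optimize it better.
You join the main query (T1) with what used to be the IN subquery (T2), then filter on T2.ID being null, which ensures that no record matching those conditions was found.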
SELECT distinct T1.ID
FROM Table T1
LEFT JOIN Table T2 on T2.ID = T1.ID AND
T2.IteamNumber in (132,434,675) AND T2.DateCreated < '2019-01-01'
WHERE T1.IteamNumber in (132,434,675) AND Year(T1.DateCreated) = 2019 AND
T2.ID is null
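To check that the rewrites discussed so far return the same rows, here is a small sketch that runs the NOT IN, NOT EXISTS and LEFT JOIN forms against the same sample data. It rests on several assumptions: an in-memory SQLite database stands in for SQL Server, the table name `Items` and its rows are made up, and `Year(DateCreated) = 2019` is written as a date range because SQLite has no `Year()` function. It says nothing about speed on the real server; it only compares results.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; table name and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Items (ID INTEGER NOT NULL, IteamNumber INTEGER, DateCreated TEXT)"
)
conn.executemany(
    "INSERT INTO Items VALUES (?, ?, ?)",
    [
        (1, 132, "2019-03-10"),  # first seen in 2019 -> expected in the result
        (2, 434, "2018-06-01"),  # has an earlier matching row ...
        (2, 434, "2019-02-15"),  # ... so ID 2 must be excluded
        (3, 999, "2018-01-01"),  # earlier row, but for an item outside the list
        (3, 675, "2019-07-04"),  # -> ID 3 is expected in the result
        (4, 132, "2020-01-05"),  # no 2019 row at all -> excluded
    ],
)

NOT_IN = """
SELECT DISTINCT ID FROM Items
WHERE IteamNumber IN (132,434,675)
  AND DateCreated >= '2019-01-01' AND DateCreated < '2020-01-01'
  AND ID NOT IN (
    SELECT ID FROM Items
    WHERE IteamNumber IN (132,434,675) AND DateCreated < '2019-01-01')
"""

NOT_EXISTS = """
SELECT DISTINCT T1.ID FROM Items AS T1
WHERE T1.IteamNumber IN (132,434,675)
  AND T1.DateCreated >= '2019-01-01' AND T1.DateCreated < '2020-01-01'
  AND NOT EXISTS (
    SELECT 1 FROM Items AS T2
    WHERE T2.ID = T1.ID
      AND T2.IteamNumber IN (132,434,675) AND T2.DateCreated < '2019-01-01')
"""

LEFT_JOIN = """
SELECT DISTINCT T1.ID FROM Items AS T1
LEFT JOIN Items AS T2 ON T2.ID = T1.ID
  AND T2.IteamNumber IN (132,434,675) AND T2.DateCreated < '2019-01-01'
WHERE T1.IteamNumber IN (132,434,675)
  AND T1.DateCreated >= '2019-01-01' AND T1.DateCreated < '2020-01-01'
  AND T2.ID IS NULL
"""


def ids(sql):
    """Run a query and return its IDs as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


results = {
    "not_in": ids(NOT_IN),
    "not_exists": ids(NOT_EXISTS),
    "left_join": ids(LEFT_JOIN),
}
print(results)
```

On this sample all three forms return IDs 1 and 3. Whether one of them is faster on the real data is something only the execution plan on the actual server can show.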
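UPDATE: here is a proposal updated for your real query. Since your subquery has inner joins, I created a CTE so that you can left join against that subquery. The functionality is the same: you join the main query with the subquery and return only the rows for which no matching record is found in the subquery.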
with previous as (
Select x.No_
from [Line] c
inner join [Header] a on a.CollectionNo = c.CollectionNo
inner join [Customer] x on x.No_ = a.CustomerNo
where c.No_ in ('2101','2102','2103','2104','2105')
and Enrollmentdate < '2014-01-01'
and (a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate > '2014-12-31')
)
Select count(distinct b.No_),'2014'
from [Line] c
inner join [Header] a on a.CollectionNo = c.CollectionNo
inner join [Customer] b on b.No_ = a.CustomerNo
left join previous p on p.No_ = b.No_
where c.No_ in ('2101','2102','2103','2104','2105')
and year(Enrollmentdate)= 2014
and (a.Resignationdate < '1754-01-01 00:00:00.000' OR a.Resignationdate >= '2014-12-31')
and p.No_ is null
If I understand correctly, you can write the query as a GROUP BY query with a HAVING clause:
SELECT ID
FROM t
WHERE IteamNumber in (132, 434, 675)
GROUP BY ID
HAVING MIN(DateCreated) >= '20190101' -- no row earlier than 2019
AND MIN(DateCreated) < '20200101' -- at least one row less than 2020
This removes the IDs for which an earlier record exists. You can improve performance further by creating a covering index:
CREATE INDEX IX_t_0001 ON t (ID) INCLUDE (IteamNumber, DateCreated)
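As a sanity check on the GROUP BY / HAVING rewrite, the sketch below compares it with a NOT EXISTS version on a handful of made-up rows. The assumptions are the same kind as before: in-memory SQLite stands in for SQL Server, the rows are invented, dates are stored as ISO text (so the literals are written `'2019-01-01'` rather than `'20190101'`), and the 2019 filter is expressed as a range. The covering index is left out because SQLite does not support the `INCLUDE` clause.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (ID INTEGER NOT NULL, IteamNumber INTEGER, DateCreated TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (10, 132, "2019-01-01"),  # earliest matching row is in 2019 -> kept
        (10, 434, "2019-11-30"),
        (20, 675, "2018-12-31"),  # earliest matching row is in 2018 -> dropped
        (20, 675, "2019-05-05"),
        (30, 434, "2020-02-02"),  # earliest matching row is in 2020 -> dropped
        (40, 555, "2017-01-01"),  # item outside the list, filtered out by WHERE
        (40, 132, "2019-09-09"),  # -> kept
    ],
)

# One pass over the matching rows: keep IDs whose earliest date falls in 2019.
GROUPED = """
SELECT ID FROM t
WHERE IteamNumber IN (132, 434, 675)
GROUP BY ID
HAVING MIN(DateCreated) >= '2019-01-01'
   AND MIN(DateCreated) <  '2020-01-01'
"""

# Reference form: a 2019 row exists and no earlier matching row exists.
ANTI_JOIN = """
SELECT DISTINCT T1.ID FROM t AS T1
WHERE T1.IteamNumber IN (132, 434, 675)
  AND T1.DateCreated >= '2019-01-01' AND T1.DateCreated < '2020-01-01'
  AND NOT EXISTS (
    SELECT 1 FROM t AS T2
    WHERE T2.ID = T1.ID
      AND T2.IteamNumber IN (132, 434, 675) AND T2.DateCreated < '2019-01-01')
"""

grouped_ids = sorted(row[0] for row in conn.execute(GROUPED))
anti_join_ids = sorted(row[0] for row in conn.execute(ANTI_JOIN))
print(grouped_ids, anti_join_ids)
```

Both queries return IDs 10 and 40 here. The grouped form reads the table once instead of probing it a second time per candidate row, which is the reason it may be cheaper, though that depends on the data and the indexes available.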