Efficiency: MONTH() vs. DATEDIFF()

Question

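I have two tables with dates that I want to join with an INNER JOIN. The tables are linked to each other by a FK, which ensures that my record in Table A and its related record in Table B fall in the same year.

Long story short - I want to make sure the two dates are in the same month. As mentioned, DATEDIFF() has no logical advantage in my case - it will never give me -12 or 12, because the year is irrelevant to the equation. My results will always be the same with DATEDIFF or MONTH (I tested this, of course).

Given these assumptions, which would be more efficient?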

    SELECT .... 
    FROM DatesA da 
    INNER JOIN DatesB db 
    ON MONTH(da.Date) = MONTH(db.Date) 
    AND [Rest of the join]

    SELECT .... 
    FROM DatesA da 
    INNER JOIN DatesB db 
    ON DATEDIFF(MM, da.Date, db.Date) = 0 
    AND [Rest of the join]

Thanks!

Answer 1

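Edit - It looks like the DateDiff approach can use an index, since it isn't a scalar function wrapping the value. A quick comparison on test data in my environment suggests that DateDiff would be several times more efficient.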

Answer 2

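Performance comparison

For me, testing with a data set of 2508 records whose dates are evenly distributed across one year, and joining the table to itself, datepart performed significantly better than datediff (the difference between datepart and month was negligible, though datepart was typically ~1ms faster). This test was done on SQL 2008 R2 (SP3). The full code is shared below: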

--prep
create table #testDates (d date)

insert #testDates
select dateadd(dd,row_number() over (partition by 1 order by number) % 365,'2017-01-01')
from master.dbo.spt_values a --, master.dbo.spt_values --uncomment this for a larger test set

select @@VERSION --Microsoft SQL Server 2008 R2 (SP3) - 10.50.6529.0 (X64) 
go


--test statements
set statistics time on
select count(1) --return 1 so we're measuring query time; not the time to return the results
from #testDates a 
inner join #testDates b 
on month(a.d) = month(b.d)
set statistics time off

set statistics time on
select count(1) 
from #testDates a 
inner join #testDates b 
on datepart(month,a.d) = datepart(month,b.d)
set statistics time off

set statistics time on
select count(1) 
from #testDates a 
inner join #testDates b 
on datediff(MM,a.d,b.d) = 0
set statistics time off

--cleanup
go
drop table #testDates

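The timings were 5ms, 4ms, and 3432ms, respectively.

That said, this is only a test of my setup with my test data... results may vary considerably under different circumstances.

What about indexed data?

Adding an index after populating the data improved datediff's performance, though only to 3390ms; still far behind the others.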

create index ix_testDates_d on #testDates(d) --create the index after populating the data to ensure it's not fragmented
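
A possible variation on the index test, sketched here rather than measured: because month is deterministic, it can back a persisted computed column that is then indexed, so the join compares a stored value instead of calling a function on each row. The column name m and the index name ix_testDates_m are hypothetical, and the snippet assumes it is run after #testDates has been populated and before the cleanup step. No timings are claimed for it.

```sql
-- Assumption: #testDates already exists and is populated (see the prep step above).
-- "m" and "ix_testDates_m" are illustrative names, not part of the original test.
alter table #testDates add m as month(d) persisted
go

create index ix_testDates_m on #testDates(m)
go

set statistics time on
select count(1) --same measurement style as the other test statements
from #testDates a
inner join #testDates b
on a.m = b.m
set statistics time off
```

The go separators are there so that the new column exists before the batches that reference it are compiled.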

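Other

Another reason to use datepart/month rather than datediff is that it makes for better self-documenting code; i.e. it shows that you are looking for dates in the same month, rather than dates where the number of months between them is 0 (which is the same thing, years aside, but the latter takes more cognitive processing).
One reason to use datepart rather than month is that datepart is ANSI compliant.
However, month has the advantage over datepart of being a deterministic function, which for some reason datepart is not!
Also, month is more intuitive; i.e. people can understand it more quickly.
The choice between datepart and month, given the negligible performance difference, should come down to your other requirements and/or coding standards.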
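
To make the "years aside" caveat concrete, here is a small illustration with two made-up dates (the variables @a and @b are purely illustrative). Both dates fall in March, so the month comparison matches, while datediff also counts the year boundary:

```sql
-- Two illustrative dates in the same calendar month, one year apart.
declare @a date = '2016-03-15', @b date = '2017-03-01';

select month(@a)            as monthA,      -- 3
       month(@b)            as monthB,      -- 3
       datediff(MM, @a, @b) as monthsApart; -- 12
```

In the question's scenario the FK already guarantees that both dates share a year, which is why the two predicates return the same rows there.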

Answer 3

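I'll use @JohnLBevan's earlier answer as the basis for mine.

This takes only 1 ms. It is a SARGable solution and uses the index on the date column.

The "trick" is to have a kind of calendar table beforehand (I created it on the fly here) holding the first and last day of each month.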

create table #testDates (d date)

insert #testDates
select dateadd(dd,row_number() over (partition by 1 order by number) % 365,'2017-01-01')
from master.dbo.spt_values a --, master.dbo.spt_values --uncomment this for a larger test set

select @@VERSION --Microsoft SQL Server 2008 R2 (SP3) - 10.50.6529.0 (X64) 
go


create index ix_testDates_d on #testDates(d) 

--test statements
set statistics time on
select count(1) --return 1 so we're measuring query time; not the time to return the results
from #testDates a 
inner join #testDates b 
on month(a.d) = month(b.d)
set statistics time off

select min(d) iniDay,max(d) endDay into #months from #testDates
group by month(d)


set statistics time on
select count(1) --return 1 so we're measuring query time; not the time to return the results
from #testDates a 
inner join #months m
on a.d>= m.iniDay and a.d<=m.endDay
inner join #testDates b 
 on b.d>= m.iniDay and b.d<=m.endDay
set statistics time off


--cleanup
go
drop table #testDates 
drop table #months

The timings were 4 ms, 10 ms for the calendar table, and 1 ms.

For 150,000 rows:

(150000 row(s) affected)

(1 row(s) affected)
SQL Server parse and compile time: 
   CPU time = 0 ms, elapsed time = 4 ms.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 141 ms,  elapsed time = 130 ms.

(12 row(s) affected)
SQL Server parse and compile time: 
   CPU time = 14 ms, elapsed time = 14 ms.

(1 row(s) affected)

 SQL Server Execution Times:
   CPU time = 47 ms,  elapsed time = 48 ms.
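
The #months table above takes its month boundaries from the dates that happen to be present in the data. As a hedged variation, the same kind of calendar table could be generated directly, so the boundaries do not depend on every month having rows on its first and last day. The table name #months2017 and the fixed year are assumptions for illustration only:

```sql
-- Hypothetical variant of #months: one row per month of 2017, built without reading #testDates.
select cast(dateadd(month, v.n, '20170101') as date)                      as iniDay,
       cast(dateadd(day, -1, dateadd(month, v.n + 1, '20170101')) as date) as endDay
into #months2017
from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11)) as v(n);

-- 12 rows; the first is 2017-01-01 .. 2017-01-31
select iniDay, endDay from #months2017 order by iniDay;

drop table #months2017;
```

It could then stand in for #months in the range joins shown earlier, since the predicates only rely on iniDay and endDay.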

Tags: sql-server, performance, datediff, datepart