DATEDIFF based on next populated column in row
I'm working on a query in SQL Server that gives me a result set that looks like this:
ID | DaysInState | DaysInState2 | DaysInState3 | DaysInState4 |
---|---|---|---|---|
1 | 2022-04-01 | 2022-04-07 | NULL | NULL |
2 | NULL | 2022-04-09 | NULL | NULL |
3 | 2022-04-11 | 2022-04-15 | NULL | 2022-04-18 |
4 | 2022-04-11 | NULL | NULL | 2022-04-18 |
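I need to calculate the number of days a given item spent in a given state. The challenge I'm facing is "looking ahead" within the row. Taking row 1 as an example, the values would be:
- DaysInState: 6 (`DATEDIFF(day, '2022-04-01', '2022-04-07')`)
- DaysInState2: 12 (`DATEDIFF(day, '2022-04-07', GETDATE())`)
- DaysInState3: NULL
- DaysInState4: NULL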
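The challenging part is that, for each column in each row, I have to look at all the columns to the right of the reference column to see whether there is a date to use in the `DATEDIFF`. If no date is found to the right of the reference column, `GETDATE()` is used. The table below shows what the result set should look like (the figures correspond to `GETDATE()` returning 2022-04-19):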
ID | DaysInState | DaysInState2 | DaysInState3 | DaysInState4 |
---|---|---|---|---|
1 | 6 | 12 | NULL | NULL |
2 | NULL | 10 | NULL | NULL |
3 | 4 | 3 | NULL | 1 |
4 | 7 | NULL | NULL | 1 |
I could write a fairly complex `CASE...WHEN` statement for each column, like this:
SELECT
CASE
WHEN DaysInState IS NOT NULL AND DaysInState2 IS NOT NULL THEN DateDiff(day, DaysInState, DaysInState2)
WHEN DaysInState IS NOT NULL AND DaysInState2 IS NULL AND DaysInState3 IS NOT NULL THEN DateDiff(day, DaysInState, DaysInState3)
...
END
...
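However, this is not very maintainable when states are added or removed. Is there a more dynamic way to solve this that doesn't involve lengthy `CASE` statements, or just a generally "better" approach that I'm not seeing?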
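If you can adjust the query that produces the result set, I'd recommend a different approach. One advantage is that it can handle additional DayInState variables (5, 6, 7, ...).
Rewrite your query so that the result has three columns: one for the ID, one for the "DayInState" number, and one for the date. That way, no NULL values are returned. Union that result set with the distinct IDs, a very large "DayInState" number, and the result of GETDATE(). Then you can use DATEDIFF() and LAG() to look at the next date.
Here is a working example in SQL Server using your data: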
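begin
    -- One row per (id, state) that actually has a date; NULL states are simply absent.
    declare @temp table (id int, state_num int, dt date)
    insert into @temp values
        (1,1,'2022-04-01'),
        (1,2,'2022-04-07'),
        (2,2,'2022-04-09'),
        (3,1,'2022-04-11'),
        (3,2,'2022-04-15'),
        (3,4,'2022-04-18'),  -- state 4, matching DaysInState4 in the question
        (4,1,'2022-04-11'),
        (4,4,'2022-04-18')

    -- Ordering by state_num desc makes LAG return the date of the next populated state.
    -- WHERE removes the 999 rows before LAG is evaluated, so the last state of each id
    -- falls back to the LAG default, GETDATE().
    select t.id, t.state_num,
           DATEDIFF(day, t.dt, LAG(t.dt, 1, GETDATE()) over (partition by t.id order by t.state_num desc)) as days_in_state
    from
        (select * from @temp
         union (select distinct id, 999 as state_num, GETDATE() as dt from @temp)) t
    where t.state_num != 999
    order by t.id, t.state_num
end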
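If the wide result set cannot be changed at its source, the same three-column shape can probably be produced on the fly. The following is only a sketch under a few assumptions: `@wide` is a hypothetical stand-in for the original result set, `@today` is a fixed date used in place of GETDATE() so the output is reproducible, and `state_num` and `days_in_state` are illustrative names. It unpivots each row with CROSS APPLY (VALUES ...) and uses LEAD() ordered ascending, which is equivalent to LAG() ordered descending.

```sql
-- Hypothetical stand-in for the original wide result set.
DECLARE @wide TABLE (
    ID           int,
    DaysInState  date,
    DaysInState2 date,
    DaysInState3 date,
    DaysInState4 date
);

INSERT INTO @wide VALUES
    (1, '2022-04-01', '2022-04-07', NULL, NULL),
    (2, NULL,         '2022-04-09', NULL, NULL),
    (3, '2022-04-11', '2022-04-15', NULL, '2022-04-18'),
    (4, '2022-04-11', NULL,         NULL, '2022-04-18');

-- Assumption: fixed "today" for a reproducible result; use GETDATE() in live queries.
DECLARE @today date = '2022-04-19';

SELECT w.ID,
       s.state_num,
       DATEDIFF(day, s.dt,
                -- next populated state's date, or @today when there is none
                LEAD(s.dt, 1, @today) OVER (PARTITION BY w.ID ORDER BY s.state_num)
       ) AS days_in_state
FROM @wide AS w
CROSS APPLY (VALUES (1, w.DaysInState),
                    (2, w.DaysInState2),
                    (3, w.DaysInState3),
                    (4, w.DaysInState4)) AS s (state_num, dt)
WHERE s.dt IS NOT NULL   -- drop empty states so LEAD only sees populated ones
ORDER BY w.ID, s.state_num;
```

With `@today` set to 2022-04-19, this returns one row per populated state, and the day counts match the expected table in the question. Adding a state means adding one line to the VALUES list rather than touching every expression.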
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[DateTest]') AND type in (N'U'))
DROP TABLE [dbo].[DateTest]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[DateTest](
[Id] [int] IDENTITY(1,1) NOT NULL,
[DaysInState] [date] NULL,
[DaysInState2] [date] NULL,
[DaysInState3] [date] NULL,
[DaysInState4] [date] NULL,
CONSTRAINT [PK_DateTest] PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [dbo].[DateTest] ([DaysInState],[DaysInState2],[DaysInState3],[DaysInState4]) VALUES
('2022-04-01','2022-04-07',NULL,NULL),
(NULL,'2022-04-09',NULL,NULL),
('2022-04-11','2022-04-15',NULL,'2022-04-18'),
('2022-04-11',NULL,NULL,'2022-04-18');
GO
SELECT [ID],[DaysInState],[DaysInState2],[DaysInState3],[DaysInState4] FROM dbo.DateTest
Use nested ISNULL calls to check the next column, falling back to GETDATE().
Using a variable means you can change the date as needed.
DECLARE @theDate date = GETDATE()
SELECT
[DaysInState] =DATEDIFF(day,[DaysInState], ISNULL([DaysInState2],ISNULL([DaysInState3],ISNULL([DaysInState4],@theDate))))
,[DaysInState2] =DATEDIFF(day,[DaysInState2],ISNULL([DaysInState3],ISNULL([DaysInState4],@theDate)))
,[DaysInState3] =DATEDIFF(day,[DaysInState3],ISNULL([DaysInState4],@theDate))
,[DaysInState4] =DATEDIFF(day,[DaysInState4],@theDate)
FROM dbo.DateTest
Your original query:
SELECT
[DaysInState]=CASE
WHEN DaysInState IS NOT NULL AND DaysInState2 IS NOT NULL THEN DateDiff(day, DaysInState, DaysInState2)
WHEN DaysInState IS NOT NULL AND DaysInState2 IS NULL AND DaysInState3 IS NOT NULL THEN DateDiff(day, DaysInState, DaysInState3)
END
FROM dbo.DateTest
The COALESCE function accepts multiple arguments, evaluates them from left to right, and returns the first non-NULL value, so no nesting is needed:
DaysInState =
    DATEDIFF(day,
             DaysInState,              -- column name as used in the question
             COALESCE(DaysInState2,
                      DaysInState3,
                      DaysInState4,
                      GETDATE())
    )
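Extended to all four columns, the COALESCE approach might look like the sketch below. It assumes a hypothetical table variable `@wide` holding the question's sample rows and a fixed `@today` in place of GETDATE() so the result is reproducible; neither name comes from the original posts.

```sql
-- Hypothetical stand-in for the original wide result set.
DECLARE @wide TABLE (
    ID           int,
    DaysInState  date,
    DaysInState2 date,
    DaysInState3 date,
    DaysInState4 date
);

INSERT INTO @wide VALUES
    (1, '2022-04-01', '2022-04-07', NULL, NULL),
    (2, NULL,         '2022-04-09', NULL, NULL),
    (3, '2022-04-11', '2022-04-15', NULL, '2022-04-18'),
    (4, '2022-04-11', NULL,         NULL, '2022-04-18');

-- Assumption: fixed "today" for a reproducible result; use GETDATE() in live queries.
DECLARE @today date = '2022-04-19';

-- Each column looks only at the columns to its right, then falls back to @today.
-- DATEDIFF returns NULL when its start date is NULL, so empty states stay NULL.
SELECT ID,
       DATEDIFF(day, DaysInState,  COALESCE(DaysInState2, DaysInState3, DaysInState4, @today)) AS DaysInState,
       DATEDIFF(day, DaysInState2, COALESCE(DaysInState3, DaysInState4, @today))               AS DaysInState2,
       DATEDIFF(day, DaysInState3, COALESCE(DaysInState4, @today))                             AS DaysInState3,
       DATEDIFF(day, DaysInState4, @today)                                                     AS DaysInState4
FROM @wide
ORDER BY ID;
```

With `@today` set to 2022-04-19, the four rows match the expected table in the question. Each new state still needs its own expression, but every expression stays a single flat COALESCE rather than a growing chain of CASE branches.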