MSSQL 2005: How can I group these data?

I have a table like this in SQL Server 2005:
No WorkOrder StartDateTime     EndDateTime       
======================================================
1  WO111111  2019-01-01 07:00  2019-01-01 08:00  
2  WO111111  2019-01-01 08:00  2019-01-01 08:30  
3  WO222222  2019-01-01 08:30  2019-01-01 09:30  
4  WO222222  2019-01-01 09:30  2019-01-01 10:00  
5  WO222222  2019-01-01 10:00  2019-01-01 12:00
6  WO111111  2019-01-01 12:00  2019-01-01 17:00

How can I get the following table?

WorkOrder StartDateTime     EndDateTime
============================================
WO111111  2019-01-01 07:00  2019-01-01 08:30
WO222222  2019-01-01 08:30  2019-01-01 12:00
WO111111  2019-01-01 12:00  2019-01-01 17:00

I tried row_number() and rank(), but without success.

DECLARE @Tmp TABLE (No int, WorkOrder varchar(20), StartDateTime datetime, EndDateTime datetime)
insert into @Tmp values(1,'WO111111','2019-01-01 07:00','2019-01-01 08:00')
insert into @Tmp values(2,'WO111111','2019-01-01 08:00','2019-01-01 08:30')
insert into @Tmp values(3,'WO222222','2019-01-01 08:30','2019-01-01 09:30')
insert into @Tmp values(4,'WO222222','2019-01-01 09:30','2019-01-01 10:00')
insert into @Tmp values(5,'WO222222','2019-01-01 10:00','2019-01-01 12:00')
insert into @Tmp values(6,'WO111111','2019-01-01 12:00','2019-01-01 17:00')
select * from @Tmp;
select g,WorkOrder,min(StartDateTime)StartDateTime,Max(EndDateTime)EndDateTime
From(
  select rank()over(order by WorkOrder)as g,* from @Tmp
)a group by g,WorkOrder

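You can use a self-join and the SUM window function. First determine where the WorkOrder value changes relative to the previous row (ordered by No), so that consecutive rows can be grouped together; then simply GROUP BY with MIN/MAX to collapse the date intervals.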

;WITH LaggedWorkOrder AS
(
    SELECT
        T1.WorkOrder,
        T1.StartDateTime,
        T1.EndDateTime,
        T1.No,
        WorkOrderChange = CASE 
            WHEN T2.WorkOrder = T1.WorkOrder THEN 0 
            ELSE 1 END
    FROM
        @Tmp AS T1
        LEFT JOIN @Tmp AS T2 ON T1.No - 1 = T2.No
),
WorkOrderGroups AS
(
    SELECT
        L.WorkOrder,
        L.StartDateTime,
        L.EndDateTime,
        L.No,
        WorkOrderGroup = SUM(L.WorkOrderChange) OVER (ORDER BY L.No ASC)
    FROM
        LaggedWorkOrder AS L
)
SELECT
    W.WorkOrder,
    StartDateTime = MIN(W.StartDateTime),
    EndDateTime = MAX(W.EndDateTime)
FROM
    WorkOrderGroups AS W
GROUP BY
    W.WorkOrderGroup,
    W.WorkOrder
ORDER BY
    W.WorkOrderGroup

Result:

WorkOrder   StartDateTime               EndDateTime
WO111111    2019-01-01 07:00:00.000     2019-01-01 08:30:00.000
WO222222    2019-01-01 08:30:00.000     2019-01-01 12:00:00.000
WO111111    2019-01-01 12:00:00.000     2019-01-01 17:00:00.000

The intermediate CTE results are as follows.

LaggedWorkOrder (flags each row where WorkOrder changes value):

WorkOrder   StartDateTime               EndDateTime                 No  WorkOrderChange
WO111111    2019-01-01 07:00:00.000     2019-01-01 08:00:00.000     1   1
WO111111    2019-01-01 08:00:00.000     2019-01-01 08:30:00.000     2   0
WO222222    2019-01-01 08:30:00.000     2019-01-01 09:30:00.000     3   1
WO222222    2019-01-01 09:30:00.000     2019-01-01 10:00:00.000     4   0
WO222222    2019-01-01 10:00:00.000     2019-01-01 12:00:00.000     5   0
WO111111    2019-01-01 12:00:00.000     2019-01-01 17:00:00.000     6   1

WorkOrderGroups (generates the grouping value for MAX/MIN):

WorkOrder   StartDateTime               EndDateTime                 No  WorkOrderGroup
WO111111    2019-01-01 07:00:00.000     2019-01-01 08:00:00.000     1   1
WO111111    2019-01-01 08:00:00.000     2019-01-01 08:30:00.000     2   1
WO222222    2019-01-01 08:30:00.000     2019-01-01 09:30:00.000     3   2
WO222222    2019-01-01 09:30:00.000     2019-01-01 10:00:00.000     4   2
WO222222    2019-01-01 10:00:00.000     2019-01-01 12:00:00.000     5   2
WO111111    2019-01-01 12:00:00.000     2019-01-01 17:00:00.000     6   3
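
One caveat, as far as I know: SUM(...) OVER (ORDER BY ...) is only accepted from SQL Server 2012 onward, so the WorkOrderGroups CTE may fail on 2005. A minimal sketch of a variant that should run on 2005 replaces the running sum with the difference of two ROW_NUMBER() values. The Numbered CTE and the Island column are names made up for illustration, and No is assumed to reflect the intended row order.

```sql
-- Sample data from the question (INSERT ... SELECT ... UNION ALL works on 2005)
DECLARE @Tmp TABLE (No int, WorkOrder varchar(20), StartDateTime datetime, EndDateTime datetime);
INSERT INTO @Tmp (No, WorkOrder, StartDateTime, EndDateTime)
SELECT 1,'WO111111','2019-01-01 07:00','2019-01-01 08:00' UNION ALL
SELECT 2,'WO111111','2019-01-01 08:00','2019-01-01 08:30' UNION ALL
SELECT 3,'WO222222','2019-01-01 08:30','2019-01-01 09:30' UNION ALL
SELECT 4,'WO222222','2019-01-01 09:30','2019-01-01 10:00' UNION ALL
SELECT 5,'WO222222','2019-01-01 10:00','2019-01-01 12:00' UNION ALL
SELECT 6,'WO111111','2019-01-01 12:00','2019-01-01 17:00';

;WITH Numbered AS
(
    SELECT
        T.WorkOrder,
        T.StartDateTime,
        T.EndDateTime,
        -- Overall position minus position within the same WorkOrder.
        -- The difference stays constant inside one consecutive run
        -- (here: 0, 0, 2, 2, 2, 3).
        Island = ROW_NUMBER() OVER (ORDER BY T.No)
               - ROW_NUMBER() OVER (PARTITION BY T.WorkOrder ORDER BY T.No)
    FROM
        @Tmp AS T
)
SELECT
    N.WorkOrder,
    StartDateTime = MIN(N.StartDateTime),
    EndDateTime = MAX(N.EndDateTime)
FROM
    Numbered AS N
GROUP BY
    N.WorkOrder,
    N.Island          -- Island alone is not unique across different WorkOrder values
ORDER BY
    MIN(N.StartDateTime);
```

Island has to be grouped together with WorkOrder, because two different work orders can happen to produce the same difference.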

PS: Please consider upgrading your server version; support for 2005 ended in April 2016.

Now knowing you're on SQL Server 2005, you'll need to rely on outer apply to make the arbitrary join required here to determine the relative previous record.

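You didn't spell out the predicate here, but guessing from the output, you are looking for the first WorkOrder of each group, where a group in this case appears to be a run of rows that lasts until a different WorkOrder appears.

The approach below uses apply with top 1 to fetch the previous record, and specifically outer apply to make sure we don't lose the first record (think of it as a left join).

The apply operator is almost always overlooked and often forgotten. But when you need to iterate without a concrete join predicate such as a key, it is a very powerful tool. I have used this approach on large tables to solve the "top n problem" and found that it sometimes performs better than the built-in alternatives.

Note that I chose No as the tiebreaker.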

CREATE TABLE #WorkOrders (
   No             INT IDENTITY PRIMARY KEY
  ,WorkOrder      VARCHAR(8) NOT NULL
  ,StartDateTime  DATETIME NOT NULL
  ,EndDateTime    DATETIME NOT NULL);

-- Multi-row VALUES requires SQL Server 2008+, so use SELECT ... UNION ALL on 2005
INSERT INTO #WorkOrders (WorkOrder, StartDateTime, EndDateTime)
SELECT 'WO111111','20190101 07:00','20190101 08:00' UNION ALL
SELECT 'WO111111','20190101 08:00','20190101 08:30' UNION ALL
SELECT 'WO111111','20190101 08:30','20190101 09:30' UNION ALL
SELECT 'WO222222','20190101 08:30','20190101 09:30' UNION ALL
SELECT 'WO222222','20190101 09:30','20190101 10:00' UNION ALL
SELECT 'WO222222','20190101 10:00','20190101 12:30' UNION ALL
SELECT 'WO111111','20190101 12:00','20190101 12:30';

SELECT  wo.WorkOrder
     ,  wo.StartDateTime
     ,  wo.EndDateTime
  FROM  #WorkOrders AS wo 
        OUTER APPLY (
          SELECT  TOP(1)
                  * 
            FROM  #WorkOrders AS wo2 
           WHERE  wo2.StartDateTime < wo.StartDateTime 
           ORDER  BY wo2.StartDateTime DESC, No DESC
        ) AS prev
 WHERE prev.WorkOrder IS NULL
       OR prev.WorkOrder <> wo.WorkOrder

DROP TABLE #WorkOrders;
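
The query above returns only the first row of each run, so its EndDateTime is that of the first row rather than of the whole run. If the merged end time is needed, one possible extension is sketched below. It is a sketch under assumptions, not a tested solution: it runs against the question's sample data in a table variable @Tmp, it orders by No instead of StartDateTime, and the names RunStarts, cur, prev and e are invented for illustration.

```sql
-- Sample data from the question
DECLARE @Tmp TABLE (No int, WorkOrder varchar(20), StartDateTime datetime, EndDateTime datetime);
INSERT INTO @Tmp (No, WorkOrder, StartDateTime, EndDateTime)
SELECT 1,'WO111111','2019-01-01 07:00','2019-01-01 08:00' UNION ALL
SELECT 2,'WO111111','2019-01-01 08:00','2019-01-01 08:30' UNION ALL
SELECT 3,'WO222222','2019-01-01 08:30','2019-01-01 09:30' UNION ALL
SELECT 4,'WO222222','2019-01-01 09:30','2019-01-01 10:00' UNION ALL
SELECT 5,'WO222222','2019-01-01 10:00','2019-01-01 12:00' UNION ALL
SELECT 6,'WO111111','2019-01-01 12:00','2019-01-01 17:00';

;WITH RunStarts AS
(
    -- Rows whose previous row (by No) is missing or has a different WorkOrder
    SELECT cur.No, cur.WorkOrder, cur.StartDateTime
    FROM @Tmp AS cur
    OUTER APPLY
    (
        SELECT TOP (1) p.WorkOrder
        FROM @Tmp AS p
        WHERE p.No < cur.No
        ORDER BY p.No DESC
    ) AS prev
    WHERE prev.WorkOrder IS NULL
       OR prev.WorkOrder <> cur.WorkOrder
)
SELECT s.WorkOrder, s.StartDateTime, e.EndDateTime
FROM RunStarts AS s
CROSS APPLY
(
    -- Latest end time among the rows from this run start up to,
    -- but not including, the next run start
    SELECT EndDateTime = MAX(t.EndDateTime)
    FROM @Tmp AS t
    WHERE t.No >= s.No
      AND NOT EXISTS (SELECT 1
                      FROM RunStarts AS s2
                      WHERE s2.No > s.No
                        AND s2.No <= t.No)
) AS e
ORDER BY s.No;
```

For the sample data the run starts are rows 1, 3 and 6, so the second apply collects rows 1 to 2, 3 to 5, and 6 respectively.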

Use GROUP BY with MIN and MAX to get the expected output:

SELECT WorkOrder, MIN(StartDateTime), MAX(EndDateTime) FROM [tb] GROUP BY WorkOrder

Output

WorkOrder StartDateTime     EndDateTime
============================================
WO111111  2019-01-01 07:00  2019-01-01 08:30
WO222222  2019-01-01 08:30  2019-01-01 12:00
WO111111  2019-01-01 12:00  2019-01-01 17:00
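
One thing worth checking with this approach: GROUP BY WorkOrder alone produces one row per distinct WorkOrder value. A minimal sketch against the question's sample data (the table variable @Tmp stands in for the table) suggests that the plain query returns two rows rather than three, because both WO111111 runs fall into the same group.

```sql
-- Sample data from the question
DECLARE @Tmp TABLE (No int, WorkOrder varchar(20), StartDateTime datetime, EndDateTime datetime);
INSERT INTO @Tmp (No, WorkOrder, StartDateTime, EndDateTime)
SELECT 1,'WO111111','2019-01-01 07:00','2019-01-01 08:00' UNION ALL
SELECT 2,'WO111111','2019-01-01 08:00','2019-01-01 08:30' UNION ALL
SELECT 3,'WO222222','2019-01-01 08:30','2019-01-01 09:30' UNION ALL
SELECT 4,'WO222222','2019-01-01 09:30','2019-01-01 10:00' UNION ALL
SELECT 5,'WO222222','2019-01-01 10:00','2019-01-01 12:00' UNION ALL
SELECT 6,'WO111111','2019-01-01 12:00','2019-01-01 17:00';

-- Plain grouping by WorkOrder: one row per work order
SELECT
    WorkOrder,
    StartDateTime = MIN(StartDateTime),
    EndDateTime = MAX(EndDateTime)
FROM @Tmp
GROUP BY WorkOrder
ORDER BY WorkOrder;

-- Rows returned for the sample data:
-- WO111111  2019-01-01 07:00  2019-01-01 17:00
-- WO222222  2019-01-01 08:30  2019-01-01 12:00
```

To keep the two WO111111 runs apart, the grouping needs an extra key that identifies each consecutive run, which is what the other answers construct.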