Clever way to reorder records?
I have a table containing the following data:
VehicleID Time ContextID Value
--------------- ----------------------- -------------------------- -----
359586015047188 2021-02-01 07:27:14.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 07:53:38.000 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 07:53:47.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 10:24:20.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 10:26:46.000 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 10:26:55.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 10:43:53.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 10:46:01.000 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 10:46:09.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 11:02:16.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 14:39:41.777 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 14:39:42.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 14:55:48.777 SafeProtectDeviceConnected False
359586015047188 2021-02-02 07:52:12.777 SafeProtectDeviceConnected True
359586015047188 2021-02-02 07:52:12.777 SafeProtectProtectionLevel 5
359586015047188 2021-02-02 07:52:32.777 SafeProtectDeviceConnected False
359586015047188 2021-02-02 07:53:57.000 SafeProtectProtectionLevel 5
359586015047188 2021-02-02 07:54:10.777 SafeProtectDeviceConnected True
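As you can see, the rows are ordered by time. Sometimes, however, the time is not correct, and I need a query to fix the order.
If the data is consistent, I will always have a row with a ContextID of SafeProtectDeviceConnected and a Value of True, followed by a row with a ContextID of SafeProtectProtectionLevel, no matter what its Value column holds.
I found that I have the LAG analytic function available to access the previous row's values, and that I can also put one or more CASE expressions in an ORDER BY clause.
So the correct result after applying the fix query to the previous result set would be: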
VehicleID Time ContextID Value
--------------- ----------------------- -------------------------- -----
359586015047188 2021-02-01 07:27:14.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 07:53:47.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 07:53:38.000 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 10:24:20.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 10:26:55.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 10:26:46.000 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 10:43:53.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 10:46:09.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 10:46:01.000 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 11:02:16.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 14:39:42.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 14:39:41.777 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 14:55:48.777 SafeProtectDeviceConnected False
359586015047188 2021-02-02 07:52:12.777 SafeProtectDeviceConnected True
359586015047188 2021-02-02 07:52:12.777 SafeProtectProtectionLevel 5
359586015047188 2021-02-02 07:52:32.777 SafeProtectDeviceConnected False
359586015047188 2021-02-02 07:54:10.777 SafeProtectDeviceConnected True
359586015047188 2021-02-02 07:53:57.000 SafeProtectProtectionLevel 5
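Basically, if we look at the reordered result set above, and especially at the Value column from top to bottom, we should have a True value, followed by one or more rows with a numeric value, and then a False value.
What I have tried so far is using:
ORDER BY t.VehiculeID, /*dbo.ContextDetail.Time*/
         CASE WHEN t.ContextID='SafeProtectDeviceConnected' AND t.Value='True' THEN 1 END,
         CASE WHEN t.ContextID='SafeProtectProtectionLevel' THEN 2 END,
         CASE WHEN t.ContextID='SafeProtectDeviceConnected' AND t.Value='False' THEN 3 END
But it (obviously) gives me all the False rows, then all the numeric rows, and then the remaining True values.
Is this a gaps-and-islands problem?
What is the right way to solve it?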
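As long as the times of the rows that must be grouped together do not overlap with those of another group, you can use the Time column to group the rows and a case expression to determine the sort order within a single group. The solution below uses a common table expression to compute these new columns, which keeps the selection and sorting simple.
Sample data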
create table log
(
VehicleID bigint,
Time datetime,
ContextID nvarchar(50),
Value nvarchar(10)
);
insert into log (VehicleID, Time, ContextID, Value) values
(359586015047188, '2021-02-01 07:27:14.777', 'SafeProtectDeviceConnected', 'False'),
(359586015047188, '2021-02-01 07:53:38.000', 'SafeProtectProtectionLevel', '5'),
(359586015047188, '2021-02-01 07:53:47.777', 'SafeProtectDeviceConnected', 'True'),
(359586015047188, '2021-02-01 10:24:20.777', 'SafeProtectDeviceConnected', 'False'),
(359586015047188, '2021-02-01 10:26:46.000', 'SafeProtectProtectionLevel', '5'),
(359586015047188, '2021-02-01 10:26:55.777', 'SafeProtectDeviceConnected', 'True'),
(359586015047188, '2021-02-01 10:43:53.777', 'SafeProtectDeviceConnected', 'False'),
(359586015047188, '2021-02-01 10:46:01.000', 'SafeProtectProtectionLevel', '5'),
(359586015047188, '2021-02-01 10:46:09.777', 'SafeProtectDeviceConnected', 'True'),
(359586015047188, '2021-02-01 11:02:16.777', 'SafeProtectDeviceConnected', 'False'),
(359586015047188, '2021-02-01 14:39:41.777', 'SafeProtectProtectionLevel', '5'),
(359586015047188, '2021-02-01 14:39:42.777', 'SafeProtectDeviceConnected', 'True'),
(359586015047188, '2021-02-01 14:55:48.777', 'SafeProtectDeviceConnected', 'False'),
(359586015047188, '2021-02-02 07:52:12.777', 'SafeProtectDeviceConnected', 'True'),
(359586015047188, '2021-02-02 07:52:12.777', 'SafeProtectProtectionLevel', '5'),
(359586015047188, '2021-02-02 07:52:32.777', 'SafeProtectDeviceConnected', 'False'),
(359586015047188, '2021-02-02 07:53:57.000', 'SafeProtectProtectionLevel', '5'),
(359586015047188, '2021-02-02 07:54:10.777', 'SafeProtectDeviceConnected', 'True');
Solution
Note: you could shorten the case expression slightly. It is written out in full here for clarity.
with cte as
(
select l.VehicleID,
l.Time,
l.ContextID,
l.Value,
(row_number() over(order by l.Time)-1)/3 as GroupNum,
case
when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'False' then 1
when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'True' then 2
when l.ContextID = 'SafeProtectProtectionLevel' then 3
end as GroupSort
from log l
)
select cte.VehicleID,
cte.Time,
cte.ContextID,
cte.Value
from cte
order by cte.GroupNum,
cte.GroupSort;
Result
VehicleID Time ContextID Value
--------------- ----------------------- -------------------------- -----
359586015047188 2021-02-01 07:27:14.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 07:53:47.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 07:53:38.000 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 10:24:20.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 10:26:55.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 10:26:46.000 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 10:43:53.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 10:46:09.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 10:46:01.000 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 11:02:16.777 SafeProtectDeviceConnected False
359586015047188 2021-02-01 14:39:42.777 SafeProtectDeviceConnected True
359586015047188 2021-02-01 14:39:41.777 SafeProtectProtectionLevel 5
359586015047188 2021-02-01 14:55:48.777 SafeProtectDeviceConnected False
359586015047188 2021-02-02 07:52:12.777 SafeProtectDeviceConnected True
359586015047188 2021-02-02 07:52:12.777 SafeProtectProtectionLevel 5
359586015047188 2021-02-02 07:52:32.777 SafeProtectDeviceConnected False
359586015047188 2021-02-02 07:54:10.777 SafeProtectDeviceConnected True
359586015047188 2021-02-02 07:53:57.000 SafeProtectProtectionLevel 5
Fiddle to see it in action.
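The fixed group size of three in the query above fits the sample data, but the question also mentions that one or more numeric rows may follow a True row. A minimal sketch of a gaps-and-islands variant, assuming every group starts with a SafeProtectDeviceConnected / False row once the rows are ordered by Time: a running count of those False rows then serves as the group number. The partition by VehicleID and the final Time tiebreaker are additions not found in the original answer, and sum(...) over(... order by ...) needs SQL Server 2012 or later.

```sql
with cte as
(
    select l.VehicleID,
           l.Time,
           l.ContextID,
           l.Value,
           -- Running count of 'False' rows per vehicle: it increases by one
           -- at each 'False' row, so every row up to the next 'False' row
           -- shares the same group number (assumption: groups start with 'False').
           sum(case
                   when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'False' then 1
                   else 0
               end) over(partition by l.VehicleID order by l.Time) as GroupNum,
           -- Same sort key inside a group as in the query above.
           case
               when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'False' then 1
               when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'True' then 2
               when l.ContextID = 'SafeProtectProtectionLevel' then 3
           end as GroupSort
    from log l
)
select cte.VehicleID,
       cte.Time,
       cte.ContextID,
       cte.Value
from cte
order by cte.VehicleID,
         cte.GroupNum,
         cte.GroupSort,
         cte.Time;  -- keeps several numeric rows of one group in time order
```

With this grouping the number of SafeProtectProtectionLevel rows per group no longer matters, but a False row with a wrong timestamp would still shift the group boundaries.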
Try this:
ORDER BY VehicleID,
         CAST(CONVERT(NVARCHAR(8), [Time], 112) AS BIGINT) * 10000 + DATEPART(HOUR, [Time]) * 100 + DATEPART(MINUTE, [Time]),
         CASE
             WHEN ContextID = 'SafeProtectDeviceConnected' AND [Value] = 'False' THEN 1
             WHEN ContextID = 'SafeProtectDeviceConnected' AND [Value] = 'True' THEN 2
             ELSE 3
         END
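The main differences from your approach: first, I ignore everything smaller than a minute in my ordering, because you mentioned that the time can be wrong. However, as far as I can see (in hindsight), the times seem to be fine but not unique, so ordering by the time should also do the trick. Second, I created one case expression that considers both the context and the value, instead of one case per column: you want to sort by the combination of the two columns, not by each column separately. So the combination Connected + False gets 1, Connected + True gets 2, and everything else gets 3.
In your query you created three case expressions, each of which returns either a value or NULL (CASE ... END without an ELSE), so in effect you added three separate values to sort by.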
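To see why the three separate case expressions sort the way they do, it can help to select them as columns. A small sketch against the log sample table from the other answer (the Key1, Key2 and Key3 aliases are made up for illustration); it relies on SQL Server placing NULL before any non-NULL value in an ascending sort:

```sql
select l.Time,
       l.ContextID,
       l.Value,
       -- Each CASE has no ELSE, so it yields NULL for every row it does not match.
       case when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'True'  then 1 end as Key1,
       case when l.ContextID = 'SafeProtectProtectionLevel'                       then 2 end as Key2,
       case when l.ContextID = 'SafeProtectDeviceConnected' and l.Value = 'False' then 3 end as Key3
from log l
order by Key1,   -- NULL for all rows that are not 'True' rows, so those sort first
         Key2,   -- among them, NULL for the 'False' rows, which precede the numeric rows
         Key3;
```

Key1 is NULL for every row that is not a True row, so all of those rows come ahead of the True rows; among them Key2 is NULL for the False rows, which therefore precede the numeric rows. That is the False, numeric, True order described in the question.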