SQL Server database: delete duplicate trend values, but keep the first and last
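I have a large database full of temperature values. The problem is that the system saved a lot of duplicate values, and now the SQL Server database (Express edition) is full.
Here is a sample of the data, from the query: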
SELECT [PointID], [Time], [Data]
FROM TrendSamples
WHERE pointid = 13
ORDER BY time DESC
Result:
PointID Time Data
---------------------------------------------
13 2020-01-02 07:29:01.077 22,2999992370605
13 2020-01-02 07:28:50.937 22,5
13 2020-01-02 07:28:05.230 22,2999992370605
13 2020-01-02 07:27:55.090 22,3999996185303
13 2020-01-02 07:27:04.510 22,3999996185303
13 2020-01-02 07:26:13.443 22,3999996185303
13 2020-01-02 07:25:22.580 22,3999996185303
13 2020-01-02 07:24:31.340 22,3999996185303
13 2020-01-02 07:23:40.370 22,3999996185303
13 2020-01-02 07:22:49.460 22,3999996185303
13 2020-01-02 07:21:59.160 22,3999996185303
13 2020-01-02 07:21:08.483 22,3999996185303
13 2020-01-02 07:20:17.713 22,3999996185303
13 2020-01-02 07:19:26.710 22,3999996185303
13 2020-01-02 07:18:35.283 22,3999996185303
13 2020-01-02 07:17:44.250 22,3999996185303
13 2020-01-02 07:16:53.463 22,3999996185303
13 2020-01-02 07:16:02.367 22,3999996185303
13 2020-01-02 07:15:11.083 22,3999996185303
13 2020-01-02 07:14:19.987 22,3999996185303
13 2020-01-02 07:13:29.230 22,3999996185303
13 2020-01-02 07:12:38.197 22,3999996185303
13 2020-01-02 07:11:47.957 22,3999996185303
13 2020-01-02 07:10:57.033 22,3999996185303
13 2020-01-02 07:10:06.293 22,3999996185303
13 2020-01-02 07:09:15.183 22,3999996185303
13 2020-01-02 07:08:24.083 22,3999996185303
13 2020-01-02 07:07:33.237 22,3999996185303
13 2020-01-02 07:06:42.140 22,3999996185303
13 2020-01-02 07:05:51.557 22,3999996185303
13 2020-01-02 07:05:00.787 22,3999996185303
13 2020-01-02 07:04:09.707 22,3999996185303
13 2020-01-02 07:03:18.970 22,3999996185303
13 2020-01-02 07:02:28.043 22,3999996185303
13 2020-01-02 07:01:36.930 22,3999996185303
13 2020-01-02 07:00:46.317 22,3999996185303
13 2020-01-02 06:59:55.390 22,3999996185303
13 2020-01-02 06:59:04.403 22,3999996185303
13 2020-01-02 06:58:13.103 22,3999996185303
13 2020-01-02 06:58:01.247 22,5
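As you can see, there is a lot of duplicate data where the value does not change at all from one timestamp to the next. Is there a way to delete all the duplicates, but keep the first and last row of each run, i.e. the rows just before and just after the value changes?
This is the result I want: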
PointID Time Data
---------------------------------------------
13 2020-01-02 07:29:01.077 22,2999992370605
13 2020-01-02 07:28:50.937 22,5
13 2020-01-02 07:28:05.230 22,2999992370605
13 2020-01-02 07:27:55.090 22,3999996185303
13 2020-01-02 06:58:13.103 22,3999996185303
13 2020-01-02 06:58:01.247 22,5
You can use an updatable CTE for this:
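-- Attach the previous and next value (per point, in time order) to every row.
with cte as (
    select
        t.*,
        lag(data)  over(partition by pointID order by time) lag_data,
        lead(data) over(partition by pointID order by time) lead_data
    from TrendSamples t
)
-- Delete only rows whose value equals both neighbours. The first and last row
-- of each run differ from one neighbour (or have a NULL one) and are kept.
delete from cte where data = lag_data and data = lead_data;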
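The CTE uses lag() and lead() to bring in the value of data from the previous and next rows with the same pointID, ordered by time. The outer query then deletes the rows whose data is the same as in both the previous and the next record.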
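To check the rule against the sample before touching real data, here is a minimal Python sketch that emulates the lag()/lead() comparison in memory. The `rows_to_keep` function and the `(time, data)` tuple shape are hypothetical names for illustration, not part of the answer, and the values are abbreviated (22.4 stands in for 22,3999996185303).

```python
from typing import List, Optional, Tuple

# (time, data) for a single PointID; a hypothetical in-memory stand-in for the table
Row = Tuple[str, float]


def rows_to_keep(rows: List[Row]) -> List[Row]:
    """Emulate the lag()/lead() rule: a row is deleted only when its value
    equals both the previous and the next value in time order."""
    ordered = sorted(rows, key=lambda r: r[0])
    kept = []
    for i, (time, data) in enumerate(ordered):
        lag_data: Optional[float] = ordered[i - 1][1] if i > 0 else None
        lead_data: Optional[float] = ordered[i + 1][1] if i < len(ordered) - 1 else None
        # In SQL, `data = NULL` is never true, so the first and last row
        # of a point are never deleted; None plays the role of NULL here.
        deleted = (
            lag_data is not None
            and lead_data is not None
            and data == lag_data
            and data == lead_data
        )
        if not deleted:
            kept.append((time, data))
    return kept


# Shortened version of the sample from the question (assumed values).
sample = [
    ("06:58:01.247", 22.5),
    ("06:58:13.103", 22.4),
    ("06:59:04.403", 22.4),
    ("06:59:55.390", 22.4),
    ("07:27:55.090", 22.4),
    ("07:28:05.230", 22.3),
    ("07:28:50.937", 22.5),
    ("07:29:01.077", 22.3),
]
kept = rows_to_keep(sample)
for time, data in kept:
    print(time, data)
```

On this sample only the two interior 22.4 rows (06:59:04.403 and 06:59:55.390) are dropped, which leaves the six rows listed in the desired result.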
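This assumes your table has an auto-generated ID column with unique numbers.
By using Row_Number() twice in a CTE, we can number the rows within each Data partition by the auto-generated ID column in ascending and in descending order, and then delete every row that is neither the first nor the last in its partition. Note that the partition is by Data value rather than by consecutive run, so this keeps the first and last row for each distinct value.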
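with result as (
    select *,
        -- 1 for the row with the lowest ID among rows sharing this Data value
        Row_Number() over(partition by Data order by ID) rownumber1,
        -- 1 for the row with the highest ID among rows sharing this Data value
        Row_Number() over(partition by Data order by ID desc) rownumber2
    from TrendSamples
)
-- Remove rows that are neither first nor last for their Data value.
delete from result where rownumber1 > 1 and rownumber2 > 1;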
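To see what partitioning by Data means in practice, here is a small Python sketch that emulates the two Row_Number() columns in memory. The `keep_by_row_number` function, the `(ID, Data)` tuple shape and the seven-row series are all made up for illustration, and the sketch assumes ID increases with time.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (ID, Data); a hypothetical in-memory stand-in for TrendSamples
Row = Tuple[int, float]


def keep_by_row_number(rows: List[Row]) -> List[Row]:
    """Emulate `delete ... where rownumber1 > 1 and rownumber2 > 1`, with both
    Row_Number() calls partitioned by Data and ordered by ID."""
    ids_per_value: Dict[float, List[int]] = defaultdict(list)
    for row_id, data in rows:
        ids_per_value[data].append(row_id)
    kept = []
    for row_id, data in sorted(rows):
        ids = ids_per_value[data]
        # rownumber1 = 1 is the smallest ID, rownumber2 = 1 is the largest ID
        if row_id == min(ids) or row_id == max(ids):
            kept.append((row_id, data))
    return kept


# The value 22.4 appears in two separate runs, split by a single 22.5 reading.
series = [(1, 22.4), (2, 22.4), (3, 22.4), (4, 22.5), (5, 22.4), (6, 22.4), (7, 22.4)]
kept_ids = [row_id for row_id, _ in keep_by_row_number(series)]
print(kept_ids)
```

Here IDs 1, 4 and 7 survive. IDs 3 and 5, the rows just before and just after the change to 22.5, are removed, because both runs of 22.4 fall into the same partition.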