SQL Server database: delete duplicate trend values, but keep the first and last

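I have a large database full of temperature values. The problem is that the system stored a lot of duplicate values, and now the SQL Server database (Express edition) is full.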

Here is a sample of the data, retrieved with this query:

SELECT [PointID], [Time], [Data] 
FROM TrendSamples 
WHERE pointid = 13 
ORDER BY time DESC

Result:

PointID Time       Data
---------------------------------------------
13  2020-01-02 07:29:01.077 22,2999992370605
13  2020-01-02 07:28:50.937 22,5
13  2020-01-02 07:28:05.230 22,2999992370605
13  2020-01-02 07:27:55.090 22,3999996185303 
13  2020-01-02 07:27:04.510 22,3999996185303
13  2020-01-02 07:26:13.443 22,3999996185303
13  2020-01-02 07:25:22.580 22,3999996185303
13  2020-01-02 07:24:31.340 22,3999996185303
13  2020-01-02 07:23:40.370 22,3999996185303
13  2020-01-02 07:22:49.460 22,3999996185303
13  2020-01-02 07:21:59.160 22,3999996185303
13  2020-01-02 07:21:08.483 22,3999996185303
13  2020-01-02 07:20:17.713 22,3999996185303
13  2020-01-02 07:19:26.710 22,3999996185303
13  2020-01-02 07:18:35.283 22,3999996185303
13  2020-01-02 07:17:44.250 22,3999996185303
13  2020-01-02 07:16:53.463 22,3999996185303
13  2020-01-02 07:16:02.367 22,3999996185303
13  2020-01-02 07:15:11.083 22,3999996185303
13  2020-01-02 07:14:19.987 22,3999996185303
13  2020-01-02 07:13:29.230 22,3999996185303
13  2020-01-02 07:12:38.197 22,3999996185303
13  2020-01-02 07:11:47.957 22,3999996185303
13  2020-01-02 07:10:57.033 22,3999996185303
13  2020-01-02 07:10:06.293 22,3999996185303
13  2020-01-02 07:09:15.183 22,3999996185303
13  2020-01-02 07:08:24.083 22,3999996185303
13  2020-01-02 07:07:33.237 22,3999996185303
13  2020-01-02 07:06:42.140 22,3999996185303
13  2020-01-02 07:05:51.557 22,3999996185303
13  2020-01-02 07:05:00.787 22,3999996185303
13  2020-01-02 07:04:09.707 22,3999996185303
13  2020-01-02 07:03:18.970 22,3999996185303
13  2020-01-02 07:02:28.043 22,3999996185303
13  2020-01-02 07:01:36.930 22,3999996185303
13  2020-01-02 07:00:46.317 22,3999996185303
13  2020-01-02 06:59:55.390 22,3999996185303
13  2020-01-02 06:59:04.403 22,3999996185303
13  2020-01-02 06:58:13.103 22,3999996185303
13  2020-01-02 06:58:01.247 22,5

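As you can see, there is a lot of duplicate data where the value does not change at all from one timestamp to the next. Is there a way to delete all the duplicates but keep the first and last row of each run, that is, the rows just after and just before the value changes?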

This is the result I want:

PointID Time    Data
---------------------------------------------
13  2020-01-02 07:29:01.077 22,2999992370605
13  2020-01-02 07:28:50.937 22,5
13  2020-01-02 07:28:05.230 22,2999992370605
13  2020-01-02 07:27:55.090 22,3999996185303 
13  2020-01-02 06:58:13.103 22,3999996185303
13  2020-01-02 06:58:01.247 22,5

You can use an updatable CTE for this:

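with cte as (
    select 
        t.*, 
        -- value of the previous and the next row for the same point, ordered by time
        lag(data) over(partition by pointID order by time) lag_data,
        lead(data) over(partition by pointID order by time) lead_data
    from TrendSamples t
)
-- delete only the rows whose value equals both neighbours
delete from cte where (data = lag_data and data = lead_data)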

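The CTE uses lag() and lead() to bring in the value of data from the previous and next rows that have the same pointID, ordered by time. The outer query then deletes the records whose data is the same as in both the previous and the next record.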
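Before deleting anything, it may be worth listing the rows that would be removed. The query below is a minimal sketch of such a preview, assuming the table is TrendSamples with the PointID, Time and Data columns shown in the question. It uses the same lag()/lead() comparison but only selects. The very first and very last row of each point have a NULL neighbour, so the comparison is not true for them and they are never listed.

```sql
-- Preview only: lists the rows the DELETE would remove, without changing any data.
-- Assumes the table TrendSamples(PointID, Time, Data) from the question.
WITH cte AS (
    SELECT
        t.PointID,
        t.[Time],
        t.Data,
        -- Value of the previous and next sample of the same point, ordered by time
        LAG(t.Data)  OVER (PARTITION BY t.PointID ORDER BY t.[Time]) AS lag_data,
        LEAD(t.Data) OVER (PARTITION BY t.PointID ORDER BY t.[Time]) AS lead_data
    FROM TrendSamples AS t
)
SELECT PointID, [Time], Data
FROM cte
WHERE Data = lag_data      -- same value as the previous sample
  AND Data = lead_data     -- and same value as the next sample
ORDER BY PointID, [Time] DESC;
```

For the sample in the question, this should list the 22,3999996185303 rows strictly between 06:58:13.103 and 07:27:55.090, which are the rows missing from the desired result.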

This assumes your table has an auto-generated ID column with unique numbers.

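By using Row_Number() twice in the CTE, once ordered by the auto-generated ID column ascending and once descending, we can number the rows of each partition from both ends and delete the rows that are neither first nor last.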

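with result as (
    select *, 
        -- position of the row within its point and value, counted from the start
        Row_Number() over(partition by PointID, Data order by ID) rownumber1,
        -- position of the row within its point and value, counted from the end
        Row_Number() over(partition by PointID, Data order by ID desc) rownumber2
    from TrendSamples 
)
-- keep only the first and the last row of each partition
delete from result where rownumber1 > 1 and rownumber2 > 1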
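One thing to watch: the query above groups rows by value, not by consecutive run. If the same temperature comes back later (22,5 appears twice in the sample), both occurrences land in one partition, and every row of that value except the overall first and last would be removed. The following is a rough sketch of a run-aware preview, not a tested solution. It assumes the hypothetical ID column from this answer increases with Time, and it uses the difference of two row numbers to label each unbroken run.

```sql
-- Preview only: rows that are neither first nor last within a consecutive run.
-- Assumptions: TrendSamples has an ID column that increases with Time
-- (hypothetical, as in this answer), plus PointID, Time and Data from the question.
WITH numbered AS (
    SELECT
        ID, PointID, [Time], Data,
        -- The difference of the two row numbers stays constant inside one
        -- unbroken run of equal values and differs between separate runs.
        ROW_NUMBER() OVER (PARTITION BY PointID ORDER BY ID)
      - ROW_NUMBER() OVER (PARTITION BY PointID, Data ORDER BY ID) AS run_id
    FROM TrendSamples
),
ranked AS (
    SELECT
        ID, PointID, [Time], Data,
        -- Position inside the run, counted from the start and from the end
        ROW_NUMBER() OVER (PARTITION BY PointID, Data, run_id ORDER BY ID)      AS rownumber1,
        ROW_NUMBER() OVER (PARTITION BY PointID, Data, run_id ORDER BY ID DESC) AS rownumber2
    FROM numbered
)
SELECT ID, PointID, [Time], Data
FROM ranked
WHERE rownumber1 > 1
  AND rownumber2 > 1
ORDER BY PointID, ID;
```

Comparing the rows returned here with the rows the value-only partition would remove shows whether recurring values matter for your data.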