U-SQL: Is it acceptable to use more than one FIRST_VALUE to remove duplicates in specific columns?
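I have a table in which several rows are duplicated because the values in two date columns differ.
Is it acceptable to use FIRST_VALUE on both columns, like this, to remove the duplicates on the specified columns?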
SELECT
EmployeeName,
FIRST_VALUE(StartDateTime) OVER(ORDER BY UpdatedDateTime DESC) AS StartDateTime,
FIRST_VALUE(UpdatedDateTime) OVER(ORDER BY UpdatedDateTime DESC) AS UpdatedDateTime
FROM @Employees;
If you need to remove duplicates on certain fields, use ROW_NUMBER() together with a CTE. The sample below uses T-SQL syntax:
-- Sample data: duplicate dates
declare @t table (id int, dt date);
insert into @t values
(1, '2018-01-14'),
(1, '2018-01-14'),
(1, '2018-01-15'),
(1, '2018-01-15');
with cte as (
select *,
-- assign row number for each partition consisting of same date
row_number() over (partition by dt order by dt) as cnt
from @t
)
-- we're interested only in one row (i.e. first)
select id, dt from cte where cnt = 1;
/*
Output:
+----+------------+
| id | dt         |
+----+------------+
| 1  | 2018-01-14 |
| 1  | 2018-01-15 |
+----+------------+
*/
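Applied to the @Employees rowset from the question, the same idea might look like the following in U-SQL. A window function such as FIRST_VALUE returns a value for every input row and never removes rows, so the query in the question would still return one row per input row; filtering on a row number is what drops the duplicates. This is a minimal sketch that has not been run: the sample rows, the choice of EmployeeName as the duplicate key, the rowset names @ranked and @latest, and the output path are all assumptions made for illustration.

```sql
// Hypothetical sample data; the real @Employees rowset comes from the question.
@Employees =
    SELECT *
    FROM (VALUES
            ("Alice", new DateTime(2018, 1, 1), new DateTime(2018, 1, 14)),
            ("Alice", new DateTime(2018, 1, 2), new DateTime(2018, 1, 15)),
            ("Bob",   new DateTime(2018, 1, 5), new DateTime(2018, 1, 10))
         ) AS T(EmployeeName, StartDateTime, UpdatedDateTime);

// Number the rows of each employee, most recently updated first.
// Assumption: EmployeeName identifies a group of duplicates.
@ranked =
    SELECT EmployeeName,
           StartDateTime,
           UpdatedDateTime,
           ROW_NUMBER() OVER (PARTITION BY EmployeeName ORDER BY UpdatedDateTime DESC) AS rn
    FROM @Employees;

// Keep only the first row of each group (U-SQL uses C# comparison syntax).
@latest =
    SELECT EmployeeName,
           StartDateTime,
           UpdatedDateTime
    FROM @ranked
    WHERE rn == 1;

// Hypothetical output path.
OUTPUT @latest
TO "/output/latest_employees.csv"
USING Outputters.Csv();
```

With this sample data, the query should keep the Alice row updated on 2018-01-15 and the single Bob row. Because both date columns are read from the same surviving row, StartDateTime and UpdatedDateTime stay consistent with each other.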