U-SQL: how do I join without getting a Cartesian product?

I have a large file with rows for every ID for every day. Each ID can have more than one record per day, but only the latest value is valid.

DailyValues:

ID int,
date datetime,
version datetime,
value1 float,
value2 float,
value3 float,
value4 float,

In T-SQL I would select MAX(version) grouped by ID and date, and then join back to the values with a CROSS APPLY.

select
  B.*
from
(
   select
     ID,
     Date,
     MAX(Version) as Version
   from DailyValues D1
   group by
     ID,
     Date
) as A
CROSS APPLY (
   select top 1 *
   from DailyValues D2
   where A.ID = D2.ID
   and A.Date = D2.Date
   and A.Version = D2.Version
   order by D2.Version desc
) as B

The file is too large for me to do this in T-SQL.

How do I do this in U-SQL?

You can first extract the CSV file into a rowset. Once it is extracted, you can select the latest version of each row as follows:

@DailyValues =
    EXTRACT ID int,
            Date DateTime,
            Version DateTime,
            value1 float,
            value2 float,
            value3 float,
            value4 float
    FROM "/Samples/Data/DailyValues.csv"
    USING Extractors.Csv(encoding: Encoding.[ASCII]);

// Number the rows within each (ID, Date) group, newest Version first, and keep row 1.
@Latest =
    SELECT ID, Date, Version
    FROM
    (
        SELECT ID, Date, Version,
               ROW_NUMBER() OVER(PARTITION BY ID, Date ORDER BY Version DESC) AS rn
        FROM @DailyValues
    ) AS t
    WHERE t.rn == 1;
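
Since the question asks for the values of the latest version and not only its key, here is a sketch of how the whole job might look end to end. It carries value1 to value4 through the ranking step and writes the result out. The output path, the rowset names @Ranked and @Latest, and the capitalised column names Date and Version are assumptions made for illustration. U-SQL column types are C# types, so float here is single precision; use double if the data needs it.

```usql
// Read the raw file; the column types are C# types.
@DailyValues =
    EXTRACT ID int,
            Date DateTime,
            Version DateTime,
            value1 float,
            value2 float,
            value3 float,
            value4 float
    FROM "/Samples/Data/DailyValues.csv"
    USING Extractors.Csv(encoding: Encoding.[ASCII]);

// Rank the rows inside each (ID, Date) group, newest Version first.
@Ranked =
    SELECT ID, Date, Version, value1, value2, value3, value4,
           ROW_NUMBER() OVER (PARTITION BY ID, Date ORDER BY Version DESC) AS rn
    FROM @DailyValues;

// Keep only the newest row of each group, values included.
@Latest =
    SELECT ID, Date, Version, value1, value2, value3, value4
    FROM @Ranked
    WHERE rn == 1;

// Hypothetical output location; adjust to your store.
OUTPUT @Latest
TO "/Samples/Output/LatestDailyValues.csv"
ORDER BY ID, Date
USING Outputters.Csv();
```

Because the ranking is a window function over a single rowset, no self-join is involved, so there is nothing that can turn into a Cartesian product. If two rows share the same ID, Date and Version, ROW_NUMBER still returns exactly one of them, but which one is not deterministic.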