Select 每组第一条和最后一条记录

Select first and last record of each group

我需要return每组集合的第一条和最后一条记录。

数据:

Code    Close       Time
USD     146116      2022-04-03 04:00:00.00 +00:00
EUR     241789      2022-03-27 17:00:00.00 +00:00
EUR     241807      2022-03-27 08:00:00.00 +00:00
USD     141800      2022-03-27 08:00:00.00 +00:00
USD     140809      2022-03-27 07:00:00.00 +00:00

T-SQL:

SELECT 
     [Code]
    ,DATEADD(dd, 0, DATEDIFF(dd, 0, [Time]))
    ,FIRST_VALUE([Close]) OVER (ORDER BY [Time] ASC) AS [Open] -- Column 'Time' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
    ,MAX([Close]) AS [High]
    ,MIN([Close]) AS [Low]
    ,LAST_VALUE([Close]) OVER (ORDER BY [Time] ASC) AS [Close] -- Column 'Time' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
FROM [CurrencyIntradayHistories]
GROUP BY [Code],
         DATEADD(dd, 0, DATEDIFF(dd, 0, [Time]))
ORDER BY [Time] ASC

想要的结果:

Code    Open    High    Low     Close   Time
EUR     241807  241807  241789  241789  2022-03-27
USD     140809  141800  140809  141800  2022-03-27
USD     146116  146116  146116  146116  2022-04-03

在整个数据集上使用 window 函数 MIN()、MAX() 和 FIRST_VALUE(),而不是在按代码和日期分组时得到的结果上:

SELECT DISTINCT
       [Code],
       FIRST_VALUE([Close]) OVER (PARTITION BY [Code], CONVERT(date, [Time]) ORDER BY [Time] ASC) AS [Open],
       MAX([Close]) OVER (PARTITION BY [Code], CONVERT(date, [Time])) AS [High],
       MIN([Close]) OVER (PARTITION BY [Code], CONVERT(date, [Time])) AS [Low],
       FIRST_VALUE([Close]) OVER (PARTITION BY [Code], CONVERT(date, [Time]) ORDER BY [Time] DESC) AS [Close],
       CONVERT(date, [Time]) [Time]
FROM [CurrencyIntradayHistories];

参见demo

这里不能直接使用FIRST_VALUE,因为它是window函数,不是聚合函数。

您需要将其嵌入 subquery/derived table 并对其使用聚合。

此外,它需要一个 PARTITION BY 子句,以及 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING

SELECT 
     cih.Code
    ,cih.Date
    ,MIN(cih.OpenPerDay) AS [Open]
    ,MAX(cih.[Close]) AS High
    ,MIN(cih.[Close]) AS Low
    ,MIN(cih.ClosePerDay) AS [Close]
FROM (
    SELECT
         *
        ,CAST(cih.Time AS date) AS Date
        ,FIRST_VALUE(cih.[Close]) OVER (PARTITION BY cih.Code, CAST(cih.Time AS date)
            ORDER BY cih.Time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS OpenPerDay
        ,LAST_VALUE(cih.[Close]) OVER (PARTITION BY cih.Code, CAST(cih.Time AS date)
            ORDER BY cih.Time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS ClosePerDay
    FROM CurrencyIntradayHistories cih
) cih
GROUP BY
  cih.Code,
  cih.Date
ORDER BY
  cih.Date;

稍微更高效的版本使用 ROW_NUMBERLEAD 来查找开始行和结束行。

SELECT 
     cih.Code
    ,cih.Date
    ,MIN(CASE WHEN cih.rn = 1 THEN cih.[Close] END) AS [Open]
    ,MAX(cih.[Close]) AS High
    ,MIN(cih.[Close]) AS Low
    ,MIN(CASE WHEN cih.NextClose IS NULL THEN cih.[Close] END) AS [Close]
FROM (
    SELECT
         *
        ,CAST(cih.Time AS date) AS Date
        ,ROW_NUMBER() OVER (PARTITION BY cih.Code, CAST(cih.Time AS date) ORDER BY cih.Time) AS rn
        ,LEAD(cih.[Close]) OVER (PARTITION BY cih.Code, CAST(cih.Time AS date)
            ORDER BY cih.Time) AS NextClose
    FROM CurrencyIntradayHistories cih
) cih
GROUP BY
  cih.Code,
  cih.Date
ORDER BY
  cih.Date;

db<>fiddle