Use T-SQL window functions to retrieve 5-minute averages from 1-minute data
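I have a database table containing one-minute Open, Close, High, Low, and Volume values for a security. I'm using SQL Server 2017, but 2019 RC is an option.
I am trying to find an efficient SQL Server query that aggregates these rows into 5-minute windows, where:
- Open = first Open value of the window
- Close = last Close value of the window
- High = max High value of the window
- Low = min Low value of the window
- Volume = average Volume across the window
Ideally this query would account for gaps in the data, i.e. be based on date calculations rather than on counting preceding/following rows.
For example, say I have (here are 6 minutes of data):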
| Time | Open | Close | High | Low | Volume |
|------------------|------|-------|------|-----|--------|
| 2019-10-30 09:30 | 5 | 10 | 15 | 1 | 125000 |
| 2019-10-30 09:31 | 10 | 15 | 20 | 5 | 100000 |
| 2019-10-30 09:32 | 15 | 20 | 25 | 10 | 120000 |
| 2019-10-30 09:33 | 20 | 25 | 30 | 15 | 10000 |
| 2019-10-30 09:34 | 20 | 22 | 40 | 2 | 13122 |
| 2019-10-30 09:35 | 22 | 30 | 35 | 4 | 15000 |

(The 09:35 row is not factored in, since it would be the first row of the next 5-minute window.)
I am trying to write a query that would give me (here is the first example of the 5-minute aggregate):
| Time | Open | Close | High | Low | Volume |
|------------------|------|-------|------|-----|---------|
| 2019-10-30 09:30 | 5 | 30 | 40 | 1 | 50224.4 |
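Any suggestions? I've been banging my head against the wall with the OVER clause and its PARTITION / RANGE options.
You want to analyze the data in 5-minute intervals. You could use window functions with the following partition clause: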
partition by datepart(year, t.[time]),
             datepart(month, t.[time]),
             datepart(day, t.[time]),
             datepart(hour, t.[time]),
             (datepart(minute, t.[time]) / 5)
Query:
select *
from (
    select
        t.[time],
        -- rn = 1 marks the first row of each 5-minute window
        row_number() over(
            partition by datepart(year, [time]),
                         datepart(month, [time]),
                         datepart(day, [time]),
                         datepart(hour, [time]),
                         (datepart(minute, [time]) / 5)
            order by [time]
        ) [rn],
        first_value([open]) over(
            partition by datepart(year, [time]),
                         datepart(month, [time]),
                         datepart(day, [time]),
                         datepart(hour, [time]),
                         (datepart(minute, [time]) / 5)
            order by [time]
        ) [open],
        last_value([close]) over(
            partition by datepart(year, [time]),
                         datepart(month, [time]),
                         datepart(day, [time]),
                         datepart(hour, [time]),
                         (datepart(minute, [time]) / 5)
            order by [time]
            -- explicit frame: the default frame ends at the current row,
            -- which would return the first row's close where rn = 1
            rows between unbounded preceding and unbounded following
        ) [close],
        max([high]) over (
            partition by datepart(year, [time]),
                         datepart(month, [time]),
                         datepart(day, [time]),
                         datepart(hour, [time]),
                         (datepart(minute, [time]) / 5)
        ) [high],
        min([low]) over (
            partition by datepart(year, [time]),
                         datepart(month, [time]),
                         datepart(day, [time]),
                         datepart(hour, [time]),
                         (datepart(minute, [time]) / 5)
        ) [low],
        avg([volume]) over (
            partition by datepart(year, [time]),
                         datepart(month, [time]),
                         datepart(day, [time]),
                         datepart(hour, [time]),
                         (datepart(minute, [time]) / 5)
        ) [volume]
    from mytable t
) t
where rn = 1
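One detail worth checking in a query like this is the window frame that LAST_VALUE sees. When an OVER clause has ORDER BY but no explicit frame, SQL Server uses RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE only looks as far as the current row. A minimal sketch on three made-up rows (the inline values and the column aliases `default_frame` and `full_frame` are illustrative, not part of the original table):

```sql
-- Compare LAST_VALUE with the default frame and with an explicit full frame.
SELECT v.[Time],
       v.[Close],
       LAST_VALUE(v.[Close]) OVER (ORDER BY v.[Time]) AS default_frame,
       LAST_VALUE(v.[Close]) OVER (
           ORDER BY v.[Time]
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS full_frame
FROM (VALUES
    (CAST('2019-10-30T09:30:00' AS datetime), 10),
    (CAST('2019-10-30T09:31:00' AS datetime), 15),
    (CAST('2019-10-30T09:32:00' AS datetime), 20)
) AS v([Time], [Close])
ORDER BY v.[Time];
-- default_frame: 10, 15, 20 (each row only sees rows up to itself)
-- full_frame:    20, 20, 20 (every row sees the whole set)
```

With a `where rn = 1` filter on top, only the full-frame variant hands the window's last close to the first row.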
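You can try this.
SELECT
    MIN([Time]) [Time],
    -- note: this is the smallest Open in the window, not necessarily the first
    MIN([Open]) [Open],
    -- note: the next window's Open stands in for Close; NULL for the last window
    LEAD(MIN([Open])) OVER (ORDER BY MIN([Time])) AS [Close],
    MAX([High]) [High],
    MIN([Low]) [Low],
    AVG([Volume]) [Volume]
FROM SampleData
GROUP BY DATEADD(MINUTE, -1 * DATEPART(MINUTE, [Time]) % 5, [Time])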
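To see how the GROUP BY expression in this answer behaves, here is a small check on made-up timestamps (the values and the `bucket_start` alias are illustrative, not from the question). The expression subtracts `minute % 5` minutes from each timestamp, so the seconds part is carried through unchanged:

```sql
-- Evaluate the grouping expression on a few sample timestamps.
SELECT v.[Time],
       DATEADD(MINUTE, -1 * DATEPART(MINUTE, v.[Time]) % 5, v.[Time]) AS bucket_start
FROM (VALUES
    (CAST('2019-10-30T09:30:00' AS datetime)),
    (CAST('2019-10-30T09:33:00' AS datetime)),
    (CAST('2019-10-30T09:34:30' AS datetime)),
    (CAST('2019-10-30T09:35:00' AS datetime))
) AS v([Time])
ORDER BY v.[Time];
-- 09:30:00 -> 09:30:00
-- 09:33:00 -> 09:30:00
-- 09:34:30 -> 09:30:30 (non-zero seconds end up in a separate group)
-- 09:35:00 -> 09:35:00
```

So this grouping appears to rely on every row sitting exactly on a minute boundary, which matches the sample data in the question but may not hold for every feed.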
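The gist of the problem is rounding datetime values down to a 5-minute boundary, which (assuming the data type is datetime) can be done using DATEADD(MINUTE, DATEDIFF(MINUTE, 0, [Time]) / 5 * 5, 0). The rest is basic grouping/window functions: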
WITH cte AS (
    SELECT clamped_time
         , [Open]
         , [Close]
         , [High]
         , [Low]
         , [Volume]
         -- rn1 = 1 is the earliest row in the bucket, rn2 = 1 the latest
         , rn1 = ROW_NUMBER() OVER (PARTITION BY clamped_time ORDER BY [Time])
         , rn2 = ROW_NUMBER() OVER (PARTITION BY clamped_time ORDER BY [Time] DESC)
    FROM t
    CROSS APPLY (
        -- floor each timestamp to the start of its 5-minute bucket
        SELECT DATEADD(MINUTE, DATEDIFF(MINUTE, 0, [Time]) / 5 * 5, 0)
    ) AS x(clamped_time)
)
SELECT clamped_time
     , MIN(CASE WHEN rn1 = 1 THEN [Open] END) AS [Open]
     , MIN(CASE WHEN rn2 = 1 THEN [Close] END) AS [Close]
     , MAX([High]) AS [High]
     , MIN([Low]) AS [Low]
     , AVG([Volume]) AS [Volume]
FROM cte
GROUP BY clamped_time
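For anyone who wants to try the bucketing on the question's six sample rows, here is a rough, self-contained sketch. The table variable `@t` and the `int` column types are assumptions (the question does not state the actual types). It also shows one thing to watch for: in SQL Server, AVG over an `int` column returns an `int`, so the fractional part is truncated unless the value is cast first.

```sql
-- Hypothetical setup: column types are assumed, not taken from the question.
DECLARE @t TABLE (
    [Time]   datetime NOT NULL PRIMARY KEY,
    [Open]   int NOT NULL,
    [Close]  int NOT NULL,
    [High]   int NOT NULL,
    [Low]    int NOT NULL,
    [Volume] int NOT NULL
);

INSERT INTO @t ([Time], [Open], [Close], [High], [Low], [Volume]) VALUES
    ('2019-10-30T09:30:00',  5, 10, 15,  1, 125000),
    ('2019-10-30T09:31:00', 10, 15, 20,  5, 100000),
    ('2019-10-30T09:32:00', 15, 20, 25, 10, 120000),
    ('2019-10-30T09:33:00', 20, 25, 30, 15,  10000),
    ('2019-10-30T09:34:00', 20, 22, 40,  2,  13122),
    ('2019-10-30T09:35:00', 22, 30, 35,  4,  15000);

SELECT x.clamped_time,
       COUNT(*) AS rows_in_bucket,
       AVG(s.[Volume]) AS avg_volume_int,                            -- integer average (truncated)
       AVG(CAST(s.[Volume] AS decimal(18, 4))) AS avg_volume_decimal -- keeps the fraction
FROM @t AS s
CROSS APPLY (
    -- same flooring expression as above
    SELECT DATEADD(MINUTE, DATEDIFF(MINUTE, 0, s.[Time]) / 5 * 5, 0)
) AS x(clamped_time)
GROUP BY x.clamped_time
ORDER BY x.clamped_time;
-- The 09:30 bucket holds 5 rows and the 09:35 bucket holds 1.
```

Because the bucket is derived from the timestamp itself, a missing minute just means fewer rows in that bucket; the bucket boundaries do not shift.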