Calculating returns within a SQL query

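I have stock price data for many companies over the past 10 years. I would like to query the table to return the annual (calendar-year) stock price return for each of these stocks. Note that the same dates may not exist for every stock, so I am trying to calculate the returns dynamically, using the earliest and latest available date for each stock in each year.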

My table looks like this:

Date       | Stock    | Price
========== | ======== | =====
2018-01-03 | AAPL     | 200
2018-04-20 | AAPL     | 210
2018-07-10 | AAPL     | 230
2018-10-05 | AAPL     | 250
2018-12-20 | AAPL     | 290
2019-01-06 | AAPL     | 300
2019-06-15 | AAPL     | 280
2019-09-10 | AAPL     | 340
2019-12-28 | AAPL     | 400
2018-01-02 | MSFT     | 80
2018-04-20 | MSFT     | 90
2018-07-10 | MSFT     | 110
2018-10-05 | MSFT     | 100
2018-12-22 | MSFT     | 95
2019-01-10 | MSFT     | 110
2019-04-20 | MSFT     | 105
2019-06-19 | MSFT     | 120
2019-09-11 | MSFT     | 140
2019-12-30 | MSFT     | 150

I would like to grab the earliest and latest price for each stock in each year, as follows:

Date       | Stock    | Price
========== | ======== | =====
2018-01-03 | AAPL     | 200
2018-12-20 | AAPL     | 290
2019-01-06 | AAPL     | 300
2019-12-28 | AAPL     | 400
2018-01-02 | MSFT     | 80
2018-12-22 | MSFT     | 95
2019-01-10 | MSFT     | 110
2019-12-30 | MSFT     | 150

Finally, I am trying to calculate the return (end-of-year price / start-of-year price - 1):

Year  | Stock    | Return
===== | ======== | =====
2018  | AAPL     | 0.45
2019  | AAPL     | 0.3333
2018  | MSFT     | 0.1875
2019  | MSFT     | 0.3636
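
As a quick check on the expected figures, the formula can be applied directly to the four start/end price pairs from the table above. This is only a sketch in T-SQL: the decimal literals (290.0 and so on) are an assumption added here to avoid integer division, and the results are rounded to four places to match the table.

```sql
-- Return = end-of-year price / start-of-year price - 1
SELECT 'AAPL' AS Stock, 2018 AS [Year], ROUND(290.0 / 200 - 1, 4) AS [Return]
UNION ALL
SELECT 'AAPL', 2019, ROUND(400.0 / 300 - 1, 4)
UNION ALL
SELECT 'MSFT', 2018, ROUND(95.0 / 80 - 1, 4)
UNION ALL
SELECT 'MSFT', 2019, ROUND(150.0 / 110 - 1, 4);
```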

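What is the most efficient way to achieve this result? I will be running this for more than 1,000 stocks over 10 years, so it could be computationally intensive.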

It shouldn't be too bad. I built this query based on your example (plus one extra row for 2017):

DECLARE @stocks TABLE (
    Date    DATETIME,
    Stock   VARCHAR(10),
    Price   MONEY
)

INSERT INTO @stocks ( Date, Stock, Price )
VALUES
('2017-01-03', 'AAPL', 200),
('2018-01-03', 'AAPL', 200),
('2018-04-20', 'AAPL', 210),
('2018-07-10', 'AAPL', 230),
('2018-10-05', 'AAPL', 250),
('2018-12-20', 'AAPL', 290),
('2019-01-06', 'AAPL', 300),
('2019-06-15', 'AAPL', 280),
('2019-09-10', 'AAPL', 340),
('2019-12-28', 'AAPL', 400),
('2018-01-02', 'MSFT', 80 ),
('2018-04-20', 'MSFT', 90 ),
('2018-07-10', 'MSFT', 110),
('2018-10-05', 'MSFT', 100),
('2018-12-22', 'MSFT', 95 ),
('2019-01-10', 'MSFT', 110),
('2019-04-20', 'MSFT', 105),
('2019-06-19', 'MSFT', 120),
('2019-09-11', 'MSFT', 140),
('2019-12-30', 'MSFT', 150)

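-- S1 holds one row per stock and calendar year with its first and last available dates
SELECT S1.Stock, S1.MinDate, S2.Price AS StartPrice, S1.MaxDate, S3.Price AS EndPrice
, (S3.Price / S2.Price) - 1 AS [Return]
FROM (
    SELECT Stock, MIN(Date) AS MinDate, MAX(Date) AS MaxDate
    FROM @stocks
    GROUP BY Stock, YEAR(Date)
) AS S1

-- Price on the first available date of the year
LEFT JOIN @stocks AS S2
ON S2.Stock = S1.Stock
AND S2.Date = S1.MinDate

-- Price on the last available date of the year
LEFT JOIN @stocks AS S3
ON S3.Stock = S1.Stock
AND S3.Date = S1.MaxDate

ORDER BY S1.Stock, YEAR(S1.MinDate)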

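However, since you have 10 years of data, you could try window functions (MIN, MAX), which may give a faster query. A window function computes an aggregate over a group (partition) of rows but, unlike GROUP BY, still returns every row in the group. First, use window functions to get the minimum and maximum dates; then filter the rows with WHERE; finally, use an aggregate function to pick up the price that corresponds to each of those dates (there is no need for DISTINCT):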

-- Pick the prices on the first and last dates, grouping by Year and Stock
select [Year], Stock,
       max(case when Date = max_date then Price end)
       / max(case when Date = min_date then Price end) - 1 as [Return]
from
    (
    -- Tag every row with the MIN and MAX date of its (Stock, year) partition
    select *,
           min(Date) over (partition by Stock, datepart(yyyy, Date)) as min_date,
           max(Date) over (partition by Stock, datepart(yyyy, Date)) as max_date,
           datepart(yyyy, Date) as [Year]
    from [Table]  -- placeholder table name; TABLE is a reserved word, hence the brackets
    ) X
where Date = min_date or Date = max_date
group by Stock, [Year]
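
To see what the inner query produces before the filtering and grouping, here is a small sketch against a table variable. The name @prices, the DECIMAL price type, and the choice of three rows are assumptions for illustration only (the values are copied from the AAPL 2018 rows in the question). Each row keeps its own Date and Price and is additionally tagged with the first and last date of its (Stock, year) partition, which is what lets the outer WHERE keep only the two boundary rows.

```sql
-- Hypothetical sample data, copied from the AAPL 2018 rows in the question
DECLARE @prices TABLE (Date DATE, Stock VARCHAR(10), Price DECIMAL(10, 2));

INSERT INTO @prices (Date, Stock, Price)
VALUES ('2018-01-03', 'AAPL', 200),
       ('2018-07-10', 'AAPL', 230),
       ('2018-12-20', 'AAPL', 290);

-- Unlike GROUP BY, the window aggregates do not collapse rows:
-- all three rows come back, each carrying the same min_date and max_date
SELECT Date, Stock, Price,
       MIN(Date) OVER (PARTITION BY Stock, YEAR(Date)) AS min_date,
       MAX(Date) OVER (PARTITION BY Stock, YEAR(Date)) AS max_date
FROM @prices;
```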

An interesting approach that does not use a subquery is:

select distinct stock, year(date) as [year],
       first_value(price) over (partition by stock, year(date) order by date) as first_price,
       first_value(price) over (partition by stock, year(date) order by date desc) as last_price,
       (first_value(price) over (partition by stock, year(date) order by date desc) /
        first_value(price) over (partition by stock, year(date) order by date) - 1
       ) as [return]  -- RETURN is a reserved word in T-SQL, so the alias is bracketed
from t;
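
The last price above is taken with FIRST_VALUE over a descending sort because LAST_VALUE with only an ORDER BY uses a default window frame that ends at the current row, so it would return each row's own price. If LAST_VALUE reads more naturally, a sketch with an explicit frame looks like this. The table variable @prices, its DECIMAL price type, and the three rows are assumptions for illustration (the values are copied from the MSFT 2018 rows in the question).

```sql
-- Hypothetical sample data, copied from the MSFT 2018 rows in the question
DECLARE @prices TABLE (Date DATE, Stock VARCHAR(10), Price DECIMAL(10, 2));

INSERT INTO @prices (Date, Stock, Price)
VALUES ('2018-01-02', 'MSFT', 80),
       ('2018-07-10', 'MSFT', 110),
       ('2018-12-22', 'MSFT', 95);

-- The explicit frame makes LAST_VALUE look at the whole partition,
-- not just the rows up to the current one
SELECT DISTINCT Stock, YEAR(Date) AS [year],
       LAST_VALUE(Price) OVER (PARTITION BY Stock, YEAR(Date) ORDER BY Date
                               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
       / FIRST_VALUE(Price) OVER (PARTITION BY Stock, YEAR(Date) ORDER BY Date)
       - 1 AS [return]
FROM @prices;
```

For these three rows the query should come back with a single row whose return is 95 / 80 - 1 = 0.1875, matching the MSFT 2018 figure in the question.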