在 pandas 中有效地找到期货数据的近月合约

efficiently finding a front-month contract on futures data in pandas

在金融领域,对 futures 策略进行投资研究可能很复杂。最简单形式的期货是两方之间的协议:持有空头头寸的一方——同意交付商品的一方;多头头寸——同意接收商品的一方。

这是我的期货数据的 DataFrame。我使用多索引结构来存储它,其中外层值是底层(f.e。SP是S&P500),内层值是每日时间戳,包含当天交易的合约数据:

                    dataPoint   RIC ExpirationDate          Exchange Description    Open       Low      High   Last  Settlement Price    Ask    Bid  Volume OpenInterest BlockVolume  TotalVolume2 SecurityDescription SecurityLongDescription
RICRoot closingDate                                                                                                                                                                                                                           
SP      2018-03-15          0  SPH0     2020-03-20  CME:Index and Options Market     NaN       NaN       NaN    NaN           2811.20    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-03-16          1  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,812.20  2,817.20    NaN           2812.30    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-03-19          2  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,753.30  2,772.30    NaN           2778.90    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-03-20          3  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,770.90  2,782.90    NaN           2779.20    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-03-21          4  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,770.20  2,796.20    NaN           2774.10    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-03-22          5  SPH0     2020-03-20  CME:Index and Options Market  2727.1  2,699.10  2,743.10    NaN           2700.60    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-03-23          6  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,650.60  2,712.60    NaN           2655.40    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-03-26          7  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,663.40  2,719.40    NaN           2716.30    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-03-27          8  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,654.30  2,727.30    NaN           2673.00    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-03-28          9  SPH0     2020-03-20  CME:Index and Options Market     NaN     2,662     2,686    NaN           2666.00    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-03-29         10  SPH0     2020-03-20  CME:Index and Options Market     NaN     2,707     2,710    NaN           2701.70    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-02         11  SPH0     2020-03-20  CME:Index and Options Market     NaN       NaN       NaN    NaN           2634.40    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-03         12  SPH0     2020-03-20  CME:Index and Options Market     NaN       NaN       NaN    NaN           2673.00    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-04         13  SPH0     2020-03-20  CME:Index and Options Market     NaN       NaN       NaN    NaN           2706.80    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-05         14  SPH0     2020-03-20  CME:Index and Options Market     NaN       NaN       NaN    NaN           2721.90    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-06         15  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,648.90  2,694.90    NaN           2666.40    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-09         16  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,670.40  2,696.40    NaN           2677.80    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-10         17  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,649.80  2,721.80    NaN           2713.30    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-11         18  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,697.30  2,711.30    NaN           2700.10    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-12         19  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,722.10  2,733.10    NaN           2723.30    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-13         20  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,714.30  2,739.30    NaN           2716.00    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-16         21  SPH0     2020-03-20  CME:Index and Options Market     NaN     2,729     2,742    NaN           2740.40    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-17         22  SPH0     2020-03-20  CME:Index and Options Market     NaN       NaN  2,772.40    NaN           2763.60    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-18         23  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,761.60  2,773.60    NaN           2769.60    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-19         24  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,745.60  2,760.60    NaN           2752.80    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-20         25  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,725.80  2,754.30    NaN           2731.00    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-23         26  SPH0     2020-03-20  CME:Index and Options Market     NaN     2,718     2,742    NaN           2731.00    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-24         27  SPH0     2020-03-20  CME:Index and Options Market     NaN     2,678     2,741    NaN           2693.30    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-25         28  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,670.30  2,703.30    NaN           2702.90    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
        2018-04-26         29  SPH0     2020-03-20  CME:Index and Options Market     NaN  2,704.90  2,731.90    NaN           2733.70    NaN    NaN       0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
                      ...   ...            ...                           ...     ...       ...       ...    ...               ...    ...    ...     ...          ...          ...          ...                 ...                     ...
DM      2009-11-06     162444  DMZ9     2009-12-18  CME:Index and Options Market   680.6     671.7     685.4  678.5            678.70  678.8  678.7  43,534      106,385          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-09     162445  DMZ9     2009-12-18  CME:Index and Options Market   678.9     678.8     696.6  694.8            694.60    NaN    NaN  32,820      105,906          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-10     162446  DMZ9     2009-12-18  CME:Index and Options Market   694.7     690.4       699  696.5            696.50    NaN    NaN  32,148      105,693          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-11     162447  DMZ9     2009-12-18  CME:Index and Options Market   696.5     695.9     706.4  700.9            700.90  700.9  700.6  29,715      106,456          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-12     162448  DMZ9     2009-12-18  CME:Index and Options Market   701.2     689.3     704.7  692.5            692.60  692.7  692.5  36,389      107,187          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-13     162449  DMZ9     2009-12-18  CME:Index and Options Market   692.1     687.7     699.9  697.3            697.20  697.2  697.1  33,429      108,543          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-16     162450  DMZ9     2009-12-18  CME:Index and Options Market   697.7     697.7     713.1  707.8            707.80  707.8  707.7  31,427      109,069          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-17     162451  DMZ9     2009-12-18  CME:Index and Options Market   707.8     703.8     709.5  707.2            707.20    NaN    NaN  28,752      110,831          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-18     162452  DMZ9     2009-12-18  CME:Index and Options Market   707.2     702.1     710.1  703.9            704.00    704  703.9  26,820      110,413          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-19     162453  DMZ9     2009-12-18  CME:Index and Options Market   704.5     686.5     705.1  691.8            692.10  692.1  691.8  31,941      109,160          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-20     162454  DMZ9     2009-12-18  CME:Index and Options Market   691.8     682.8     692.2  686.7            686.60    687  686.8  23,836      109,060          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-23     162455  DMZ9     2009-12-18  CME:Index and Options Market   685.6     685.6     703.1  693.5            693.40    NaN    NaN  29,003      110,324          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-24     162456  DMZ9     2009-12-18  CME:Index and Options Market   693.0     685.6     703.1  690.2            690.20  690.4  690.2  24,880      109,864          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-25     162457  DMZ9     2009-12-18  CME:Index and Options Market   690.9     690.5     697.9  696.4            696.20    NaN    NaN  18,158      110,711          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-27     162458  DMZ9     2009-12-18  CME:Index and Options Market   696.8     664.2     697.9  680.4            680.80  680.9  680.3  24,338      110,767          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-11-30     162459  DMZ9     2009-12-18  CME:Index and Options Market   683.2     675.7     688.3  682.6            684.00    NaN    NaN  43,854      109,066          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-01     162460  DMZ9     2009-12-18  CME:Index and Options Market   682.2     675.7     697.1  693.8            693.80    NaN    NaN  30,998      109,759          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-02     162461  DMZ9     2009-12-18  CME:Index and Options Market   693.8     681.5     703.6  699.3            699.00  699.3  698.9  30,303      109,894          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-03     162462  DMZ9     2009-12-18  CME:Index and Options Market   698.9     690.3     705.6  690.5            690.50  690.5  690.3  30,782      108,935          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-04     162463  DMZ9     2009-12-18  CME:Index and Options Market   690.3     689.1     708.7  702.1            702.00  702.2  701.9  41,948      110,221          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-07     162464  DMZ9     2009-12-18  CME:Index and Options Market   702.6     695.4     705.8  700.2            700.30  700.4  700.3  27,204      109,477          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-08     162465  DMZ9     2009-12-18  CME:Index and Options Market   700.7     688.8     705.8  693.0            692.90    693  692.9  40,485      107,389          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-09     162466  DMZ9     2009-12-18  CME:Index and Options Market   693.0     688.4     703.5  695.8            695.80  695.8  695.4  48,281       90,618          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-10     162467  DMZ9     2009-12-18  CME:Index and Options Market   696.1     688.4     704.6  701.4            701.20  701.4  700.9  37,863       71,260          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-11     162468  DMZ9     2009-12-18  CME:Index and Options Market   699.3     693.1     708.2  705.9            705.70  706.1  705.9  23,273       54,474          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-14     162469  DMZ9     2009-12-18  CME:Index and Options Market   711.5     704.3     716.1  713.5            713.50  713.7  713.4  20,099       37,606          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-15     162470  DMZ9     2009-12-18  CME:Index and Options Market   713.5     704.3     717.8  715.2            715.00    NaN    NaN  14,287       26,672          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-16     162471  DMZ9     2009-12-18  CME:Index and Options Market   720.1     710.5     722.3  717.9            717.80    NaN    NaN  11,968       19,930          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-17     162472  DMZ9     2009-12-18  CME:Index and Options Market   716.7     709.8     718.5  714.4            714.80    NaN    NaN   2,540       17,058          NaN          NaN    MIDC E-MINI DEC9                     NaN
        2009-12-18     162473  DMZ9     2009-12-18  CME:Index and Options Market     NaN       NaN       720    NaN            714.83  720.2    NaN     NaN          NaN          NaN          NaN    MIDC E-MINI DEC9                     NaN

期货的有趣和独特之处在于,到期日较远的合约,正如您在上面 2020 年到期的合约中看到的那样,实际上没有交易量。以下是 2018 年 3 月 15 日使用以下代码获取的所有期货交易合约的列表:

In [32]: subdf = df.loc["SP"]

In [33]: today_df = subdf.loc[[pd.to_datetime("2018-03-15")]]

In [34]: today_df
Out[34]: 
            dataPoint   RIC ExpirationDate          Exchange Description    Open       Low   High    Last  Settlement Price  Ask  Bid Volume OpenInterest BlockVolume  TotalVolume2 SecurityDescription SecurityLongDescription
closingDate                                                                                                                                                                                                                    
2018-03-15          0  SPH0     2020-03-20  CME:Index and Options Market     NaN       NaN    NaN     NaN            2811.2  NaN  NaN      0          NaN          NaN          NaN      SP500 IDX MAR0                     NaN
2018-03-15      12962  SPH8     2018-03-16  CME:Index and Options Market  2757.0     2,745  2,762  2749.5            2751.6  NaN  NaN  3,150       59,617          NaN          NaN      SP500 IDX MAR8                     NaN
2018-03-15      13215  SPH9     2019-03-15  CME:Index and Options Market     NaN       NaN    NaN     NaN            2772.4  NaN  NaN      0          NaN          NaN          NaN      SP500 IDX MAR9                     NaN
2018-03-15      26461  SPM8     2018-06-15  CME:Index and Options Market  2756.7  2,745.40  2,767  2757.9            2755.6  NaN  NaN  1,965       27,404          NaN          NaN      SP500 IDX JUN8                     NaN
2018-03-15      29007  SPM9     2019-06-21  CME:Index and Options Market     NaN       NaN    NaN     NaN            2776.6  NaN  NaN      0          NaN          NaN          NaN      SP500 IDX JUN9                     NaN
2018-03-15      41063  SPU8     2018-09-21  CME:Index and Options Market     NaN       NaN    NaN     NaN            2762.2  NaN  NaN      3           25          NaN          NaN      SP500 IDX SEP8                     NaN
2018-03-15      43600  SPU9     2019-09-20  CME:Index and Options Market     NaN       NaN    NaN     NaN            2784.0  NaN  NaN      0          NaN          NaN          NaN      SP500 IDX SEP9                     NaN
2018-03-15      44765  SPZ0     2020-12-18  CME:Index and Options Market     NaN       NaN    NaN     NaN            2831.0  NaN  NaN      0          NaN          NaN          NaN      SP500 IDX DEC0                     NaN
2018-03-15      45912  SPZ1     2021-12-17  CME:Index and Options Market     NaN       NaN    NaN     NaN            2885.1  NaN  NaN      0          NaN          NaN          NaN      SP500 IDX DEC1                     NaN
2018-03-15      48719  SPZ2     2022-12-16  CME:Index and Options Market     NaN       NaN    NaN     NaN            2935.1  NaN  NaN      0          NaN          NaN          NaN      SP500 IDX DEC2                     NaN
2018-03-15      60616  SPZ8     2018-12-21  CME:Index and Options Market     NaN       NaN    NaN     NaN            2767.3  NaN  NaN      0          300          NaN          NaN      SP500 IDX DEC8                     NaN
2018-03-15      62668  SPZ9     2019-12-20  CME:Index and Options Market     NaN       NaN    NaN     NaN            2791.4  NaN  NaN      0          NaN          NaN          NaN      SP500 IDX DEC9                     NaN

从"ExpirationDate"栏可以看出,有很多合约在不同时间到期,即月合约 是最接近您当前日期的日期,通常在 3 个月内。所以很多研究者在交易的时候都会获取即月合约,我有代码可以做到:

######## COLLECT AND ADJUST DATA ########
df = pd.read_csv("Futures Data.csv", header=0, thousands=",", dtype="str")
df["closingDate"] = pd.to_datetime(df["closingDate"])
df["ExpirationDate"] = pd.to_datetime(df["ExpirationDate"])
underlierList = df.RICRoot.unique()
start_date = pd.to_datetime("7/1/1982")
end_date = pd.to_datetime('1/1/2012')
businessDays = pd.bdate_range(start_date, end_date)


####### ITERATE OVER TIMESPAN #######
while current_date <= end_date:

    if current_date not in businessDays:
        current_date += pd.DateOffset(days=1)
        continue

    for underlier in underlierList:
        ########## SEPERATE DATA BY UNDERLIER ##########
        subdf = df.loc[underlier]

        ########## ACQUIRE CURRENT TRADING DAYS FOR ALL HORIZON CONTRACT LENGTHS##########
        try:
            today_df = subdf.loc[[current_date]]
            yesterday_date = current_date-pd.DateOffset(days=1)
            yesterday_df = subdf.loc[[yesterday_date]]
            dataPresence = True
        except:
            continue

        ########## FIND FRONT MONTH CONTRACT ##########
        v = pd.to_datetime(today_df.reset_index()['ExpirationDate'])
        idx = (v.mask(v < current_date) - pd.to_datetime(current_date)).abs().idxmin()
        today_row = today_df.iloc[idx, :]

        ######### HANDLE MISSING YESTERDAY ##########
        for i in range(8):
            try:
                yesterday_row = yesterday_df.iloc[idx, :]
            except IndexError:
                continue
           ######### PRINT CURRENT DATA ##########
        print(underlier)
        print(" Current Trade Day  :  Expiration Date  ")
        print(yesterday_row.name, ":", yesterday_row["ExpirationDate"])
        print(today_row.name, ":", today_row["ExpirationDate"])
        print("----------------------------------------")

但是,这非常慢。而当运行这段代码放在很长的范围内时,就变得异常耗时。是否有更有效的逻辑来解决这个问题,如果有,我该如何实施?也许我什至应该更改我的初始数据结构(多索引)。

您可以按 ExpirationDate 排序并取前两行:

import pandas as pd

df = pd.DataFrame([
    {'closingDate': '2018-03-11', 'ExpirationDate': '2020-03-20'},
    {'closingDate': '2018-03-11', 'ExpirationDate': '2019-03-20'},
    {'closingDate': '2018-03-11', 'ExpirationDate': '2019-04-20'},
    {'closingDate': '2018-03-11', 'ExpirationDate': '2018-12-20'},
    {'closingDate': '2018-03-15', 'ExpirationDate': '2020-03-20'},
    {'closingDate': '2018-03-15', 'ExpirationDate': '2019-03-20'},
    {'closingDate': '2018-03-15', 'ExpirationDate': '2019-04-20'},
    {'closingDate': '2018-03-15', 'ExpirationDate': '2018-12-20'},
])

df['closingDate'] = pd.to_datetime(df['closingDate'])
df['ExpirationDate'] = pd.to_datetime(df['ExpirationDate'])

today_df = df[df['closingDate'] == '2018-03-15']
today_df.sort_values('ExpirationDate').head(2)
#   ExpirationDate closingDate
# 7     2018-12-20  2018-03-15
# 5     2019-03-20  2018-03-15

如果您想同时获得所有 closingDate 的两个最低 ExpirationDate,请使用

df.groupby('closingDate').apply(lambda g: g.sort_values('ExpirationDate').head(2))
#               ExpirationDate closingDate
# closingDate                             
# 2018-03-11  3     2018-12-20  2018-03-11
#             1     2019-03-20  2018-03-11
# 2018-03-15  7     2018-12-20  2018-03-15
#             5     2019-03-20  2018-03-15

或者,您可以同时按 closingDateExpirationDate 排序,而不是在每个组中排序,然后只获取每个组中最上面的两行,完全向量化操作:

df = df.sort_values(['closingDate', 'ExpirationDate'])
df.groupby('closingDate').head(2)
#   ExpirationDate closingDate
# 3     2018-12-20  2018-03-11
# 1     2019-03-20  2018-03-11
# 7     2018-12-20  2018-03-15
# 5     2019-03-20  2018-03-15