Pandas: efficiently interpolate sections of a larger DataFrame
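I have data coming from the Bloomberg API. It is returned as one large single DataFrame (66 rows, 5 columns) and then sent out over a websocket in one chunk as a single JSON string.
I need to run a simple linear interpolation on this 66-row DataFrame, but the interpolation has to be done separately for each currency (e.g. KWN = Korean won, priced around 1190, while the Chinese yuan is only around 6, so we cannot interpolate across currencies).
At the moment I filter my DataFrame very inefficiently on index.str, so that the first 3 characters match the currency selected in the current iteration.
If anyone has any ideas or tips to help speed this up, I would be very grateful. Many thanks :)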
    self.ccy_prefix = ['KWN', 'IRN', 'NTN', 'IHN', 'PPN', 'CCN']
    for ccy in self.ccy_prefix:
        # interpolate each currency's section separately
        df[df.index.str[:3]==ccy] = df[df.index.str[:3]==ccy].interpolate(method='linear', limit_direction='forward', axis=0)
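This does work, but it is very slow and inefficient. Is there a way to use map or some other clever pandas method? I have tried to find an alternative but could not find anything so far.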
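To make the per-currency requirement concrete, here is a minimal sketch on a tiny made-up frame (the shortened index labels and the values are illustrative assumptions, not the Bloomberg data). It compares interpolating the whole column at once with interpolating within each three-character prefix group.

```python
import numpy as np
import pandas as pd

# Tiny made-up frame with two currency prefixes (illustrative values only).
df = pd.DataFrame(
    {"BID": [27.0, np.nan, 14000.0, np.nan, 14200.0]},
    index=["NTN+1W", "NTN+1M", "IHN+1W", "IHN+1M", "IHN+2M"],
)

# Naive: interpolating the whole column lets the NTN gap borrow from IHN.
naive = df["BID"].interpolate(method="linear", limit_direction="forward")

# Per currency: group on the first three characters of the index,
# then interpolate inside each group only.
per_ccy = df.groupby(df.index.str[:3])["BID"].transform(
    lambda s: s.interpolate(method="linear", limit_direction="forward")
)

print(pd.DataFrame({"naive": naive, "per_ccy": per_ccy}))
```

In the naive result the missing NTN value lands halfway between an NTN price and an IHN price, which is meaningless. In the grouped result the trailing NTN gap is filled from NTN data only.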
Filtered, interpolated DF:
Large single DataFrame:
Original DataFrame (as a dict):
{'BID': {'KWN+1W BGN Curncy': 1192.83, 'KWN+1M BGN Curncy': 1193.46, 'KWN+2M BGN Curncy': 1194.2, 'KWN+3M BGN Curncy': 1194.68, 'KWN+6M BGN Curncy': 1195.74, 'KWN+9M BGN Curncy': 1196.15, 'KWN+12M BGN Curncy': 1195.99, 'KWN+2Y BGN Curncy': 1195.57, 'KWN+3Y BGN Curncy': 1194.0, 'KWN+4Y BGN Curncy': nan, 'KWN+5Y BGN Curncy': 1188.95, 'IRN+1W BGN Curncy': 74.61, 'IRN+1M BGN Curncy': 74.83, 'IRN+2M BGN Curncy': 75.07, 'IRN+3M BGN Curncy': 75.51, 'IRN+6M BGN Curncy': 76.37, 'IRN+9M BGN Curncy': 77.22, 'IRN+12M BGN Curncy': 78.07, 'IRN+2Y BGN Curncy': 81.63, 'IRN+3Y BGN Curncy': nan, 'IRN+4Y BGN Curncy': 87.98, 'IRN+5Y BGN Curncy': 91.65, 'NTN+1W BGN Curncy': 27.576, 'NTN+1M BGN Curncy': 27.517, 'NTN+2M BGN Curncy': 27.442, 'NTN+3M BGN Curncy': 27.372, 'NTN+6M BGN Curncy': 27.174, 'NTN+9M BGN Curncy': 26.98, 'NTN+12M BGN Curncy': 26.784, 'NTN+2Y BGN Curncy': nan, 'NTN+3Y BGN Curncy': nan, 'NTN+4Y BGN Curncy': nan, 'NTN+5Y BGN Curncy': nan, 'IHN+1W BGN Curncy': 14337.8, 'IHN+1M BGN Curncy': 14369.2, 'IHN+2M BGN Curncy': 14417.0, 'IHN+3M BGN Curncy': 14448.8, 'IHN+6M BGN Curncy': 14595.9, 'IHN+9M BGN Curncy': 14703.8, 'IHN+12M BGN Curncy': 14896.0, 'IHN+2Y BGN Curncy': 15504.8, 'IHN+3Y BGN Curncy': nan, 'IHN+4Y BGN Curncy': nan, 'IHN+5Y BGN Curncy': nan, 'PPN+1W BGN Curncy': 51.58, 'PPN+1M BGN Curncy': 51.81, 'PPN+2M BGN Curncy': 52.01, 'PPN+3M BGN Curncy': 52.15, 'PPN+6M BGN Curncy': 52.56, 'PPN+9M BGN Curncy': 52.89, 'PPN+12M BGN Curncy': 53.17, 'PPN+2Y BGN Curncy': 54.32, 'PPN+3Y BGN Curncy': 55.68, 'PPN+4Y BGN Curncy': 56.46, 'PPN+5Y BGN Curncy': 57.72, 'CCN+1W BGN Curncy': 6.361, 'CCN+1M BGN Curncy': 6.373, 'CCN+2M BGN Curncy': 6.3853, 'CCN+3M BGN Curncy': 6.3976, 'CCN+6M BGN Curncy': 6.428, 'CCN+9M BGN Curncy': 6.4541, 'CCN+12M BGN Curncy': 6.4776, 'CCN+2Y BGN Curncy': 6.5653, 'CCN+3Y BGN Curncy': 6.6229, 'CCN+4Y BGN Curncy': 6.7332, 'CCN+5Y BGN Curncy': 6.8305}, 'ASK': {'KWN+1W BGN Curncy': 1193.65, 'KWN+1M BGN Curncy': 1194.46, 'KWN+2M BGN Curncy': 1195.2, 'KWN+3M 
BGN Curncy': 1195.72, 'KWN+6M BGN Curncy': 1197.06, 'KWN+9M BGN Curncy': 1197.48, 'KWN+12M BGN Curncy': 1197.81, 'KWN+2Y BGN Curncy': 1197.28, 'KWN+3Y BGN Curncy': 1195.0, 'KWN+4Y BGN Curncy': nan, 'KWN+5Y BGN Curncy': 1189.95, 'IRN+1W BGN Curncy': 74.65, 'IRN+1M BGN Curncy': 74.88, 'IRN+2M BGN Curncy': 75.12, 'IRN+3M BGN Curncy': 75.56, 'IRN+6M BGN Curncy': 76.42, 'IRN+9M BGN Curncy': 77.28, 'IRN+12M BGN Curncy': 78.14, 'IRN+2Y BGN Curncy': 81.68, 'IRN+3Y BGN Curncy': nan, 'IRN+4Y BGN Curncy': 89.98, 'IRN+5Y BGN Curncy': 91.99, 'NTN+1W BGN Curncy': 27.606, 'NTN+1M BGN Curncy': 27.533, 'NTN+2M BGN Curncy': 27.472, 'NTN+3M BGN Curncy': 27.402, 'NTN+6M BGN Curncy': 27.204, 'NTN+9M BGN Curncy': 27.014, 'NTN+12M BGN Curncy': 26.829, 'NTN+2Y BGN Curncy': nan, 'NTN+3Y BGN Curncy': nan, 'NTN+4Y BGN Curncy': nan, 'NTN+5Y BGN Curncy': nan, 'IHN+1W BGN Curncy': 14378.0, 'IHN+1M BGN Curncy': 14401.0, 'IHN+2M BGN Curncy': 14439.0, 'IHN+3M BGN Curncy': 14499.7, 'IHN+6M BGN Curncy': 14652.1, 'IHN+9M BGN Curncy': 14803.2, 'IHN+12M BGN Curncy': 14965.0, 'IHN+2Y BGN Curncy': 15545.2, 'IHN+3Y BGN Curncy': nan, 'IHN+4Y BGN Curncy': nan, 'IHN+5Y BGN Curncy': nan, 'PPN+1W BGN Curncy': 51.63, 'PPN+1M BGN Curncy': 51.86, 'PPN+2M BGN Curncy': 52.07, 'PPN+3M BGN Curncy': 52.22, 'PPN+6M BGN Curncy': 52.6, 'PPN+9M BGN Curncy': 52.99, 'PPN+12M BGN Curncy': 53.29, 'PPN+2Y BGN Curncy': 54.4, 'PPN+3Y BGN Curncy': 55.8, 'PPN+4Y BGN Curncy': 57.06, 'PPN+5Y BGN Curncy': 58.1, 'CCN+1W BGN Curncy': 6.366, 'CCN+1M BGN Curncy': 6.3781, 'CCN+2M BGN Curncy': 6.3911, 'CCN+3M BGN Curncy': 6.4026, 'CCN+6M BGN Curncy': 6.433, 'CCN+9M BGN Curncy': 6.4591, 'CCN+12M BGN Curncy': 6.4846, 'CCN+2Y BGN Curncy': 6.5753, 'CCN+3Y BGN Curncy': 6.6441, 'CCN+4Y BGN Curncy': 6.7483, 'CCN+5Y BGN Curncy': 6.8405}, 'MID': {'KWN+1W BGN Curncy': 1193.24, 'KWN+1M BGN Curncy': 1193.96, 'KWN+2M BGN Curncy': 1194.7, 'KWN+3M BGN Curncy': 1195.2, 'KWN+6M BGN Curncy': 1196.4, 'KWN+9M BGN Curncy': 1196.82, 'KWN+12M BGN Curncy': 
1196.9, 'KWN+2Y BGN Curncy': 1196.42, 'KWN+3Y BGN Curncy': 1194.5, 'KWN+4Y BGN Curncy': nan, 'KWN+5Y BGN Curncy': 1189.45, 'IRN+1W BGN Curncy': 74.63, 'IRN+1M BGN Curncy': 74.85, 'IRN+2M BGN Curncy': 75.09, 'IRN+3M BGN Curncy': 75.53, 'IRN+6M BGN Curncy': 76.4, 'IRN+9M BGN Curncy': 77.25, 'IRN+12M BGN Curncy': 78.1, 'IRN+2Y BGN Curncy': 81.65, 'IRN+3Y BGN Curncy': nan, 'IRN+4Y BGN Curncy': 88.98, 'IRN+5Y BGN Curncy': 91.82, 'NTN+1W BGN Curncy': 27.591, 'NTN+1M BGN Curncy': 27.525, 'NTN+2M BGN Curncy': 27.457, 'NTN+3M BGN Curncy': 27.387, 'NTN+6M BGN Curncy': 27.189, 'NTN+9M BGN Curncy': 26.997, 'NTN+12M BGN Curncy': 26.806, 'NTN+2Y BGN Curncy': nan, 'NTN+3Y BGN Curncy': nan, 'NTN+4Y BGN Curncy': nan, 'NTN+5Y BGN Curncy': nan, 'IHN+1W BGN Curncy': 14357.9, 'IHN+1M BGN Curncy': 14385.1, 'IHN+2M BGN Curncy': 14428.0, 'IHN+3M BGN Curncy': 14474.2, 'IHN+6M BGN Curncy': 14624.0, 'IHN+9M BGN Curncy': 14753.5, 'IHN+12M BGN Curncy': 14930.5, 'IHN+2Y BGN Curncy': 15525.0, 'IHN+3Y BGN Curncy': nan, 'IHN+4Y BGN Curncy': nan, 'IHN+5Y BGN Curncy': nan, 'PPN+1W BGN Curncy': 51.6, 'PPN+1M BGN Curncy': 51.83, 'PPN+2M BGN Curncy': 52.04, 'PPN+3M BGN Curncy': 52.18, 'PPN+6M BGN Curncy': 52.58, 'PPN+9M BGN Curncy': 52.94, 'PPN+12M BGN Curncy': 53.23, 'PPN+2Y BGN Curncy': 54.36, 'PPN+3Y BGN Curncy': 55.74, 'PPN+4Y BGN Curncy': 56.76, 'PPN+5Y BGN Curncy': 57.91, 'CCN+1W BGN Curncy': 6.3635, 'CCN+1M BGN Curncy': 6.3755, 'CCN+2M BGN Curncy': 6.3882, 'CCN+3M BGN Curncy': 6.4001, 'CCN+6M BGN Curncy': 6.4305, 'CCN+9M BGN Curncy': 6.4566, 'CCN+12M BGN Curncy': 6.4811, 'CCN+2Y BGN Curncy': 6.5703, 'CCN+3Y BGN Curncy': 6.6335, 'CCN+4Y BGN Curncy': 6.7408, 'CCN+5Y BGN Curncy': 6.8355}, 'LAST_BID_TIME_TODAY_REALTIME': {'KWN+1W BGN Curncy': datetime.time(20, 34, 41), 'KWN+1M BGN Curncy': datetime.time(20, 34, 35), 'KWN+2M BGN Curncy': datetime.time(20, 34, 56), 'KWN+3M BGN Curncy': datetime.time(20, 34, 56), 'KWN+6M BGN Curncy': datetime.time(20, 34, 56), 'KWN+9M BGN Curncy': datetime.time(20, 34, 
56), 'KWN+12M BGN Curncy': datetime.time(20, 34, 56), 'KWN+2Y BGN Curncy': datetime.time(20, 34, 56), 'KWN+3Y BGN Curncy': datetime.time(20, 34, 34), 'KWN+4Y BGN Curncy': nan, 'KWN+5Y BGN Curncy': datetime.time(20, 31, 31), 'IRN+1W BGN Curncy': datetime.time(19, 50, 20), 'IRN+1M BGN Curncy': datetime.time(20, 34, 49), 'IRN+2M BGN Curncy': datetime.time(20, 34, 48), 'IRN+3M BGN Curncy': datetime.time(20, 34, 48), 'IRN+6M BGN Curncy': datetime.time(20, 34, 48), 'IRN+9M BGN Curncy': datetime.time(20, 34, 43), 'IRN+12M BGN Curncy': datetime.time(20, 34, 48), 'IRN+2Y BGN Curncy': datetime.time(20, 32, 12), 'IRN+3Y BGN Curncy': nan, 'IRN+4Y BGN Curncy': datetime.time(8, 45, 3), 'IRN+5Y BGN Curncy': datetime.time(20, 31, 35), 'NTN+1W BGN Curncy': datetime.time(20, 31, 17), 'NTN+1M BGN Curncy': datetime.time(20, 34, 35), 'NTN+2M BGN Curncy': datetime.time(20, 31, 30), 'NTN+3M BGN Curncy': datetime.time(20, 31, 30), 'NTN+6M BGN Curncy': datetime.time(20, 31, 30), 'NTN+9M BGN Curncy': datetime.time(18, 0, 42), 'NTN+12M BGN Curncy': datetime.time(18, 0, 42), 'NTN+2Y BGN Curncy': nan, 'NTN+3Y BGN Curncy': NaT, 'NTN+4Y BGN Curncy': NaT, 'NTN+5Y BGN Curncy': NaT, 'IHN+1W BGN Curncy': datetime.time(20, 34, 26), 'IHN+1M BGN Curncy': datetime.time(20, 33, 51), 'IHN+2M BGN Curncy': datetime.time(20, 34, 11), 'IHN+3M BGN Curncy': datetime.time(20, 33, 51), 'IHN+6M BGN Curncy': datetime.time(20, 33, 51), 'IHN+9M BGN Curncy': datetime.time(20, 0, 21), 'IHN+12M BGN Curncy': datetime.time(20, 33, 51), 'IHN+2Y BGN Curncy': datetime.time(19, 27, 44), 'IHN+3Y BGN Curncy': nan, 'IHN+4Y BGN Curncy': NaT, 'IHN+5Y BGN Curncy': NaT, 'PPN+1W BGN Curncy': datetime.time(20, 34, 11), 'PPN+1M BGN Curncy': datetime.time(20, 33, 54), 'PPN+2M BGN Curncy': datetime.time(20, 33, 54), 'PPN+3M BGN Curncy': datetime.time(20, 33, 54), 'PPN+6M BGN Curncy': datetime.time(20, 33, 54), 'PPN+9M BGN Curncy': datetime.time(19, 46, 19), 'PPN+12M BGN Curncy': datetime.time(20, 33, 54), 'PPN+2Y BGN Curncy': 
datetime.time(16, 5, 40), 'PPN+3Y BGN Curncy': datetime.time(20, 34, 56), 'PPN+4Y BGN Curncy': datetime.time(20, 34, 56), 'PPN+5Y BGN Curncy': datetime.time(20, 34, 56), 'CCN+1W BGN Curncy': datetime.time(20, 34, 28), 'CCN+1M BGN Curncy': datetime.time(20, 34, 28), 'CCN+2M BGN Curncy': datetime.time(20, 34, 28), 'CCN+3M BGN Curncy': datetime.time(20, 34, 28), 'CCN+6M BGN Curncy': datetime.time(20, 34, 28), 'CCN+9M BGN Curncy': datetime.time(20, 34, 28), 'CCN+12M BGN Curncy': datetime.time(20, 34, 28), 'CCN+2Y BGN Curncy': datetime.time(20, 32, 13), 'CCN+3Y BGN Curncy': datetime.time(20, 32, 40), 'CCN+4Y BGN Curncy': datetime.time(20, 32, 13), 'CCN+5Y BGN Curncy': datetime.time(20, 23, 29)}, 'SETTLEMENT_DATE_RT': {'KWN+1W BGN Curncy': datetime.datetime(2022, 1, 27, 0, 0), 'KWN+1M BGN Curncy': datetime.datetime(2022, 2, 22, 0, 0), 'KWN+2M BGN Curncy': datetime.datetime(2022, 3, 21, 0, 0), 'KWN+3M BGN Curncy': datetime.datetime(2022, 4, 20, 0, 0), 'KWN+6M BGN Curncy': datetime.datetime(2022, 7, 20, 0, 0), 'KWN+9M BGN Curncy': datetime.datetime(2022, 10, 20, 0, 0), 'KWN+12M BGN Curncy': datetime.datetime(2023, 1, 20, 0, 0), 'KWN+2Y BGN Curncy': datetime.datetime(2024, 1, 22, 0, 0), 'KWN+3Y BGN Curncy': datetime.datetime(2025, 1, 21, 0, 0), 'KWN+4Y BGN Curncy': datetime.datetime(2026, 1, 20, 0, 0), 'KWN+5Y BGN Curncy': datetime.datetime(2027, 1, 20, 0, 0), 'IRN+1W BGN Curncy': datetime.datetime(2022, 1, 27, 0, 0), 'IRN+1M BGN Curncy': datetime.datetime(2022, 2, 22, 0, 0), 'IRN+2M BGN Curncy': datetime.datetime(2022, 3, 21, 0, 0), 'IRN+3M BGN Curncy': datetime.datetime(2022, 4, 20, 0, 0), 'IRN+6M BGN Curncy': datetime.datetime(2022, 7, 20, 0, 0), 'IRN+9M BGN Curncy': datetime.datetime(2022, 10, 20, 0, 0), 'IRN+12M BGN Curncy': datetime.datetime(2023, 1, 20, 0, 0), 'IRN+2Y BGN Curncy': datetime.datetime(2024, 1, 22, 0, 0), 'IRN+3Y BGN Curncy': datetime.datetime(2025, 1, 21, 0, 0), 'IRN+4Y BGN Curncy': datetime.datetime(2026, 1, 20, 0, 0), 'IRN+5Y BGN Curncy': 
datetime.datetime(2027, 1, 20, 0, 0), 'NTN+1W BGN Curncy': datetime.datetime(2022, 1, 27, 0, 0), 'NTN+1M BGN Curncy': datetime.datetime(2022, 2, 22, 0, 0), 'NTN+2M BGN Curncy': datetime.datetime(2022, 3, 21, 0, 0), 'NTN+3M BGN Curncy': datetime.datetime(2022, 4, 20, 0, 0), 'NTN+6M BGN Curncy': datetime.datetime(2022, 7, 20, 0, 0), 'NTN+9M BGN Curncy': datetime.datetime(2022, 10, 20, 0, 0), 'NTN+12M BGN Curncy': datetime.datetime(2023, 1, 20, 0, 0), 'NTN+2Y BGN Curncy': datetime.datetime(2024, 1, 22, 0, 0), 'NTN+3Y BGN Curncy': datetime.datetime(2025, 1, 21, 0, 0), 'NTN+4Y BGN Curncy': datetime.datetime(2026, 1, 20, 0, 0), 'NTN+5Y BGN Curncy': datetime.datetime(2027, 1, 20, 0, 0), 'IHN+1W BGN Curncy': datetime.datetime(2022, 1, 27, 0, 0), 'IHN+1M BGN Curncy': datetime.datetime(2022, 2, 22, 0, 0), 'IHN+2M BGN Curncy': datetime.datetime(2022, 3, 21, 0, 0), 'IHN+3M BGN Curncy': datetime.datetime(2022, 4, 20, 0, 0), 'IHN+6M BGN Curncy': datetime.datetime(2022, 7, 20, 0, 0), 'IHN+9M BGN Curncy': datetime.datetime(2022, 10, 20, 0, 0), 'IHN+12M BGN Curncy': datetime.datetime(2023, 1, 20, 0, 0), 'IHN+2Y BGN Curncy': datetime.datetime(2024, 1, 22, 0, 0), 'IHN+3Y BGN Curncy': datetime.datetime(2025, 1, 21, 0, 0), 'IHN+4Y BGN Curncy': datetime.datetime(2026, 1, 20, 0, 0), 'IHN+5Y BGN Curncy': datetime.datetime(2027, 1, 20, 0, 0), 'PPN+1W BGN Curncy': datetime.datetime(2022, 1, 26, 0, 0), 'PPN+1M BGN Curncy': datetime.datetime(2022, 2, 22, 0, 0), 'PPN+2M BGN Curncy': datetime.datetime(2022, 3, 21, 0, 0), 'PPN+3M BGN Curncy': datetime.datetime(2022, 4, 19, 0, 0), 'PPN+6M BGN Curncy': datetime.datetime(2022, 7, 19, 0, 0), 'PPN+9M BGN Curncy': datetime.datetime(2022, 10, 19, 0, 0), 'PPN+12M BGN Curncy': datetime.datetime(2023, 1, 19, 0, 0), 'PPN+2Y BGN Curncy': datetime.datetime(2024, 1, 19, 0, 0), 'PPN+3Y BGN Curncy': datetime.datetime(2025, 1, 21, 0, 0), 'PPN+4Y BGN Curncy': datetime.datetime(2026, 1, 20, 0, 0), 'PPN+5Y BGN Curncy': datetime.datetime(2027, 1, 19, 0, 0), 'CCN+1W 
BGN Curncy': datetime.datetime(2022, 1, 27, 0, 0), 'CCN+1M BGN Curncy': datetime.datetime(2022, 2, 22, 0, 0), 'CCN+2M BGN Curncy': datetime.datetime(2022, 3, 21, 0, 0), 'CCN+3M BGN Curncy': datetime.datetime(2022, 4, 20, 0, 0), 'CCN+6M BGN Curncy': datetime.datetime(2022, 7, 20, 0, 0), 'CCN+9M BGN Curncy': datetime.datetime(2022, 10, 20, 0, 0), 'CCN+12M BGN Curncy': datetime.datetime(2023, 1, 20, 0, 0), 'CCN+2Y BGN Curncy': datetime.datetime(2024, 1, 22, 0, 0), 'CCN+3Y BGN Curncy': datetime.datetime(2025, 1, 21, 0, 0), 'CCN+4Y BGN Curncy': datetime.datetime(2026, 1, 20, 0, 0), 'CCN+5Y BGN Curncy': datetime.datetime(2027, 1, 20, 0, 0)}}
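It seems that groupby followed by interpolate should be faster. Unfortunately, when I run your code I don't actually get the "filtered interpolated DF" you listed (perhaps you left out part of the interpolation, where you specified that it should be at 15-minute intervals?). If you use str.startswith instead of str[:3], you get a slight speedup: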
    %%timeit
    for ccy in ccy_prefix:
        df[df.index.str[:3]==ccy] = df[df.index.str[:3]==ccy].interpolate(limit_direction='forward')
    # 10 loops, best of 5: 25.9 ms per loop
versus:
    %%timeit
    for ccy in ccy_prefix:
        df[df.index.str.startswith(ccy)] = df[df.index.str.startswith(ccy)].interpolate(limit_direction='forward')
    # 10 loops, best of 5: 24.1 ms per loop
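Both loops above build the same boolean mask twice per currency. As a rough sketch (not benchmarked here, and using a small made-up frame rather than the original data), the mask can be computed once and reused through .loc:

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the real frame; labels and prices are illustrative.
ccy_prefix = ["KWN", "CCN"]
df = pd.DataFrame(
    {"BID": [1192.0, np.nan, 1194.0, 6.36, np.nan, 6.40]},
    index=["KWN+1W", "KWN+1M", "KWN+2M", "CCN+1W", "CCN+1M", "CCN+2M"],
)

for ccy in ccy_prefix:
    # Boolean array marking this currency's rows, computed once per iteration.
    mask = df.index.str.startswith(ccy)
    df.loc[mask] = df.loc[mask].interpolate(limit_direction="forward")

print(df)
```

Whether this is measurably faster on 66 rows is an open question; it mainly removes the repeated string comparison.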
Perhaps a better solution is to create a new column holding the currency prefix, then groupby and interpolate, as suggested in the comments.
    df['ccy_prefix'] = df.index.str[:3]
    def interpolator(df):
        # interpolate within a single currency group
        return df.interpolate(limit_direction='forward')
Then this should be the fastest of these:
    df = df.groupby('ccy_prefix').apply(interpolator)
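A variation on the same idea, offered as a sketch under assumptions rather than a drop-in replacement: group directly on the index prefix (no helper column) and interpolate only the numeric columns. The real frame also holds time and date columns, and depending on the pandas version, interpolating object-dtype columns may warn or fail. The frame below is made up, and NOTE is a hypothetical stand-in for those non-numeric columns.

```python
import numpy as np
import pandas as pd

# Made-up frame; NOTE stands in for non-numeric columns such as timestamps.
df = pd.DataFrame(
    {
        "BID": [1192.0, np.nan, 1194.0, 6.36, np.nan, 6.40],
        "ASK": [1193.0, np.nan, 1195.0, 6.37, np.nan, 6.41],
        "NOTE": ["a", None, "c", "d", None, "f"],
    },
    index=["KWN+1W", "KWN+1M", "KWN+2M", "CCN+1W", "CCN+1M", "CCN+2M"],
)

# Only numeric columns are interpolated; everything else is left untouched.
num_cols = df.select_dtypes("number").columns.tolist()
prefix = df.index.str[:3]  # grouping key derived from the index

# transform returns a result aligned to the original rows,
# so it can be assigned straight back into the frame.
df[num_cols] = df.groupby(prefix)[num_cols].transform(
    lambda s: s.interpolate(limit_direction="forward")
)

print(df)
```

Using transform instead of apply keeps the original index and row order, which avoids having to drop a group level afterwards.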