Multiply Correlation and Volatility Dataframes with Multi-Index to Get Covariance Matrix
I am multiplying a volatility DataFrame (rvm) by a correlation DataFrame (omega_tilde) to get a covariance matrix.
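In matrix terms, the operation described here can be written as follows. This is a sketch under the assumption that each date's block of rvm is a diagonal matrix of volatilities and each date's block of omega_tilde is a correlation matrix; the symbols D_t, Ω_t, Σ_t and σ are introduced only for illustration and do not appear in the original post.

```latex
\Sigma_t = D_t \, \Omega_t \, D_t,
\qquad
(\Sigma_t)_{ij} = \sigma_{t,i} \, \sigma_{t,j} \, (\Omega_t)_{ij},
\qquad
D_t = \operatorname{diag}(\sigma_{t,1}, \dots, \sigma_{t,n})
```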
rvm DataFrame (5790 rows × 10 columns):
NoDur Durbl Manuf Enrgy HiTec Telcm Shops Hlth Utils Other
Date lvl1
1972-11-30 NoDur 0.00666 0 0 0 0 0 0 0 0 0
Durbl 0 0.00939 0 0 0 0 0 0 0 0
Manuf 0 0 0.00803 0 0 0 0 0 0 0
Enrgy 0 0 0 0.00851 0 0 0 0 0 0
HiTec 0 0 0 0 0.01205 0 0 0 0 0
Telcm 0 0 0 0 0 0.00799 0 0 0 0
Shops 0 0 0 0 0 0 0.00795 0 0 0
Hlth 0 0 0 0 0 0 0 0.00819 0 0
Utils 0 0 0 0 0 0 0 0 0.00505 0
Other 0 0 0 0 0 0 0 0 0 0.00892
1972-11-31 NoDur 0.00664 0 0 0 0 0 0 0 0 0
Durbl 0 0.00943 0 0 0 0 0 0 0 0
Manuf 0 0 0.00800 0 0 0 0 0 0 0
Enrgy 0 0 0 0.00837 0 0 0 0 0 0
HiTec 0 0 0 0 0.01185 0 0 0 0 0
Telcm 0 0 0 0 0 0.00792 0 0 0 0
Shops 0 0 0 0 0 0 0.00794 0 0 0
Hlth 0 0 0 0 0 0 0 0.00804 0 0
Utils 0 0 0 0 0 0 0 0 0.00504 0
Other 0 0 0 0 0 0 0 0 0 0.00889
omega_tilde DataFrame (5790 rows × 10 columns):
NoDur Durbl Manuf Enrgy HiTec Telcm Shops Hlth Utils Other
Date level_1
2021-01-31 NoDur 1.00000 0.62369 0.87367 0.65322 0.74356 0.84011 0.77417 0.80183 0.82833 0.84094
Durbl 0.62369 1.00000 0.69965 0.57501 0.70125 0.60104 0.68652 0.61333 0.45301 0.70556
Manuf 0.87367 0.69965 1.00000 0.78599 0.81415 0.84477 0.80932 0.82127 0.74803 0.94673
Enrgy 0.65322 0.57501 0.78599 1.00000 0.59940 0.67492 0.58058 0.61946 0.57830 0.81593
HiTec 0.74356 0.70125 0.81415 0.59940 1.00000 0.75436 0.91318 0.84508 0.59302 0.81109
Telcm 0.84011 0.60104 0.84477 0.67492 0.75436 1.00000 0.77555 0.77342 0.73186 0.85595
Shops 0.77417 0.68652 0.80932 0.58058 0.91318 0.77555 1.00000 0.81197 0.61574 0.79932
Hlth 0.80183 0.61333 0.82127 0.61946 0.84508 0.77342 0.81197 1.00000 0.70032 0.80875
Utils 0.82833 0.45301 0.74803 0.57830 0.59302 0.73186 0.61574 0.70032 1.00000 0.72739
Other 0.84094 0.70556 0.94673 0.81593 0.81109 0.85595 0.79932 0.80875 0.72739 1.00000
2021-02-28 NoDur 1.00000 0.61544 0.87041 0.64622 0.73941 0.83792 0.77075 0.79993 0.82813 0.83937
Durbl 0.61544 1.00000 0.69464 0.55865 0.70203 0.59109 0.68265 0.60963 0.44792 0.69685
Manuf 0.87041 0.69464 1.00000 0.78243 0.81121 0.84189 0.80395 0.81809 0.74489 0.94605
Enrgy 0.64622 0.55865 0.78243 1.00000 0.58911 0.67134 0.56925 0.61252 0.56865 0.81365
HiTec 0.73941 0.70203 0.81121 0.58911 1.00000 0.74904 0.91274 0.84179 0.58973 0.80581
Telcm 0.83792 0.59109 0.84189 0.67134 0.74904 1.00000 0.77078 0.76844 0.72814 0.85493
Shops 0.77075 0.68265 0.80395 0.56925 0.91274 0.77078 1.00000 0.80924 0.61446 0.79342
Hlth 0.79993 0.60963 0.81809 0.61252 0.84179 0.76844 0.80924 1.00000 0.69965 0.80394
Utils 0.82813 0.44792 0.74489 0.56865 0.58973 0.72814 0.61446 0.69965 1.00000 0.72542
Other 0.83937 0.69685 0.94605 0.81365 0.80581 0.85493 0.79342 0.80394 0.72542 1.00000
The code I tried:
sigma_tilde = omega_tilde.groupby(level='Date').apply(lambda g: rvm_diag.loc[g.name].dot(g.values@(rvm_diag.loc[g.name])))
The error I get:
ValueError: matrices are not aligned
Edit: I also tried the following:
reshaped = omega_tilde.values.reshape(omega_tilde.index.levels[0].nunique(), omega_tilde.index.levels[1].nunique(), omega_tilde.shape[-1])
np.einsum('ijk,ik->ijk', rvm_diag.values, np.einsum('ijk,ik->ij', reshaped, rvm_diag.values))
The error here:
ValueError: operands could not be broadcast together with remapped shapes [original->remapped]: (579,10,10)->(579,10,10) (5790,10)->(5790,newaxis,10)
The output I want has the same format as the omega_tilde DataFrame, i.e. one matrix per date.
Any help is appreciated. Thanks!
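The code that gives you ValueError: matrices are not aligned only needs .values added for the matrix multiplication to work, and you can use @ for both multiplication steps so that a DataFrame is returned.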
sigma_tilde = (
    omega_tilde
    .groupby(level='Date')
    .apply(lambda g: rvm.loc[g.name].values @ (g.values @ rvm.loc[g.name]))
)
# additional step to change the second level of index
# (set_levels no longer accepts inplace=True in pandas 2.0+, so assign the result back)
sigma_tilde.index = sigma_tilde.index.set_levels(omega_tilde.columns, level=1)
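Why adding .values helps can be seen in a small sketch. DataFrame.dot aligns the left operand's column labels with the right operand's row labels and raises if they differ. As far as I can tell, the inner product in the question yields a frame whose row labels are plain integers, so the sketch below builds such a frame explicitly rather than relying on that behaviour. The names vol, corr, inner and fixed are made up for this illustration.

```python
import numpy as np
import pandas as pd

names = ["NoDur", "Durbl", "Manuf"]
vol = pd.DataFrame(np.diag([0.00666, 0.00939, 0.00803]), index=names, columns=names)
corr = pd.DataFrame(
    [[1.00000, 0.62369, 0.87367],
     [0.62369, 1.00000, 0.69965],
     [0.87367, 0.69965, 1.00000]],
    index=names, columns=names,
)

# Stand-in for the inner product: correct values, but integer row labels (0, 1, 2).
inner = pd.DataFrame(corr.values @ vol.values, columns=names)

try:
    # vol.columns are industry names, inner.index is 0..2, so alignment fails.
    vol.dot(inner)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Dropping the labels on the left operand skips the alignment check.
# np.asarray keeps the sketch independent of what type the @ operator returns.
fixed = np.asarray(vol.values @ inner)

print(error_message)
print(fixed)
```

The same label mismatch is why the answer relabels the second index level afterwards.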
In a smaller example (the top-left 3x3 quadrant of the DFs above, but with the same values for both months and with both DFs using the same two months):
import numpy as np
import pandas as pd

# correlation matrices: one 3x3 block per date
omega_tilde = pd.DataFrame(
    np.array(
        [[1.00000, 0.62369, 0.87367],
         [0.62369, 1.00000, 0.69965],
         [0.87367, 0.69965, 1.00000],
         [1.00000, 0.62369, 0.87367],
         [0.62369, 1.00000, 0.69965],
         [0.87367, 0.69965, 1.00000]]
    ),
    index=pd.MultiIndex.from_arrays(
        [[pd.Timestamp('2021-01-31'), pd.Timestamp('2021-01-31'),
          pd.Timestamp('2021-01-31'), pd.Timestamp('2021-02-28'),
          pd.Timestamp('2021-02-28'), pd.Timestamp('2021-02-28')],
         ['NoDur', 'Durbl', 'Manuf'] * 2],
        names=['Date', 'level_1']
    ),
    columns=['NoDur', 'Durbl', 'Manuf']
)
# diagonal volatility matrices: one 3x3 block per date
rvm = pd.DataFrame(
    np.array(
        [[0.00666, 0, 0],
         [0, 0.00939, 0],
         [0, 0, 0.00803],
         [0.00666, 0, 0],
         [0, 0.00939, 0],
         [0, 0, 0.00803]]
    ),
    index=pd.MultiIndex.from_arrays(
        [[pd.Timestamp('2021-01-31'), pd.Timestamp('2021-01-31'),
          pd.Timestamp('2021-01-31'), pd.Timestamp('2021-02-28'),
          pd.Timestamp('2021-02-28'), pd.Timestamp('2021-02-28')],
         ['NoDur', 'Durbl', 'Manuf'] * 2],
        names=['Date', 'level_1']
    ),
    columns=['NoDur', 'Durbl', 'Manuf']
)
The multiplication code will produce:
level_1 NoDur Durbl Manuf
2021-01-31 NoDur 0.000044 0.000039 0.000047
Durbl 0.000039 0.000088 0.000053
Manuf 0.000047 0.000053 0.000064
2021-02-28 NoDur 0.000044 0.000039 0.000047
Durbl 0.000039 0.000088 0.000053
Manuf 0.000047 0.000053 0.000064
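For larger inputs, a vectorized variant of the einsum attempt from the question might look like the sketch below. The broadcast error there comes from pairing the reshaped (579, 10, 10) array with rvm_diag.values, which still has 5790 rows; reshaping both frames into stacks of square matrices avoids that. This is only a sketch: it assumes both frames are sorted by Date, cover the same dates, and list the industries in the same order in every block. The helper names (corr_block, n_dates, n_assets, corr, vol, cov) are made up for the illustration.

```python
import numpy as np
import pandas as pd

names = ["NoDur", "Durbl", "Manuf"]
dates = pd.to_datetime(["2021-01-31", "2021-02-28"])
index = pd.MultiIndex.from_product([dates, names], names=["Date", "level_1"])

corr_block = np.array([[1.00000, 0.62369, 0.87367],
                       [0.62369, 1.00000, 0.69965],
                       [0.87367, 0.69965, 1.00000]])
omega_tilde = pd.DataFrame(np.tile(corr_block, (2, 1)), index=index, columns=names)
rvm = pd.DataFrame(np.tile(np.diag([0.00666, 0.00939, 0.00803]), (2, 1)),
                   index=index, columns=names)

n_dates = omega_tilde.index.get_level_values("Date").nunique()
n_assets = omega_tilde.shape[1]

# Stack the per-date blocks into arrays of shape (n_dates, n_assets, n_assets).
corr = omega_tilde.to_numpy().reshape(n_dates, n_assets, n_assets)
vol = rvm.to_numpy().reshape(n_dates, n_assets, n_assets)

# Batched triple product vol @ corr @ vol, one per date.
cov = np.einsum("tij,tjk,tkl->til", vol, corr, vol)

# Flatten back to the layout of omega_tilde, reusing its index and columns.
sigma_tilde = pd.DataFrame(cov.reshape(-1, n_assets),
                           index=omega_tilde.index,
                           columns=omega_tilde.columns)
print(sigma_tilde)
```

Because the result reuses omega_tilde's index directly, no relabelling step is needed in this variant.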