Calculating volatility manually vs. with built-in functions gives different results
Can someone help me understand where I went wrong? I don't know why the volatility in the two columns is different...
Here is my code sample:
from math import sqrt
from numpy import around
from numpy.random import uniform
from pandas import DataFrame
from statistics import stdev

data = around(a=uniform(low=1.0, high=50.0, size=(500, 1)), decimals=3)
df = DataFrame(data=data, columns=['close'], dtype='float64')
df.loc[:, 'delta'] = df.loc[:, 'close'].pct_change().fillna(0).round(3)
volatility = []
for index in range(df.shape[0]):
    if index < 90:
        volatility.append(0)
    else:
        start = index - 90
        stop = index + 1
        volatility.append(stdev(df.loc[start:stop, 'delta']) * sqrt(252))
df.loc[:, 'volatility1'] = volatility
df.loc[:, 'volatility2'] = df.loc[:, 'delta'].rolling(window=90).std(ddof=0) * sqrt(252)
print(df)
      close   delta  volatility1  volatility2
0    10.099   0.000     0.000000          NaN
1    26.331   1.607     0.000000          NaN
2    32.361   0.229     0.000000          NaN
3     2.068  -0.936     0.000000          NaN
4    36.241  16.525     0.000000          NaN
..      ...     ...          ...          ...
495  48.015  -0.029    46.078037    46.132943
496   6.988  -0.854    46.036210    46.178820
497  23.331   2.339    46.003184    45.837245
498  25.551   0.095    45.608260    45.792188
499  46.248   0.810    45.793012    45.769787

[500 rows x 4 columns]
Thanks a lot!
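Three small changes are needed; I have marked them with inline comments. 89 is required because .loc slicing includes both endpoints (unlike most other slicing in Python). ddof=1 is required because stdev uses it by default. This article talks about numpy std rather than stdev, but the theory of what ddof does is the same.
Also, in future, try changing size to something like 95. You don't need the other 405 rows when debugging, and it is helpful to see the transition from 0/NaN to actual volatility values; that is where you can see that you need 89, not 90.
The 0 vs. NaN difference is still there. It comes from you appending 0 and from the default behavior of rolling. I wasn't sure whether that was intentional, so I left it alone.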
from math import sqrt
from numpy import around
from numpy.random import uniform
from pandas import DataFrame
from statistics import stdev

data = around(a=uniform(low=1.0, high=50.0, size=(500, 1)), decimals=3)
df = DataFrame(data=data, columns=['close'], dtype='float64')
df['delta'] = df['close'].pct_change().fillna(0).round(3)
volatility = []
for index in range(df.shape[0]):
    if index < 89:  # change to 89
        volatility.append(0)
    else:
        start = index - 89  # change to 89
        stop = index  # was index + 1; .loc already includes the end label
        volatility.append(stdev(df.loc[start:stop, 'delta']) * sqrt(252))
df['volatility1'] = volatility
df['volatility2'] = df.loc[:, 'delta'].rolling(window=90).std(ddof=1) * sqrt(252)  # change to ddof=1
print(df)
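If you want to convince yourself that the two columns now agree, here is a minimal sketch of a check using the smaller size suggested above (95 rows). The seed, the WINDOW constant and the variable names are my own choices for illustration and are not part of the original code. It also fills the warm-up rows with NaN instead of 0, purely so the two series can be compared directly.

```python
from math import sqrt
from statistics import stdev

import numpy as np
import pandas as pd

WINDOW = 90  # same window length as in the question

# Seeded generator so the check is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
close = pd.Series(rng.uniform(1.0, 50.0, size=95).round(3), name="close")
delta = close.pct_change().fillna(0).round(3)

# .loc slices by label and includes both endpoints; .iloc behaves like
# ordinary Python slicing and excludes the end position.
assert len(delta.loc[0:WINDOW - 1]) == WINDOW
assert len(delta.iloc[0:WINDOW - 1]) == WINDOW - 1

# Manual rolling sample standard deviation (statistics.stdev divides by n - 1).
# Rows without a full window get NaN here, to mirror what rolling() returns.
manual = pd.Series(
    [
        float("nan") if i < WINDOW - 1
        else stdev(delta.loc[i - WINDOW + 1:i].tolist()) * sqrt(252)
        for i in range(len(delta))
    ]
)

# Built-in equivalent: ddof=1 also divides by n - 1.
builtin = delta.rolling(window=WINDOW).std(ddof=1) * sqrt(252)

matches = bool(np.allclose(manual, builtin, equal_nan=True))
print(pd.DataFrame({"manual": manual, "builtin": builtin}).iloc[WINDOW - 3:])
print("columns match:", matches)
```

The printed tail starts two rows before the first full window, so you can see the NaN rows end at index 88 and real values begin at index 89.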