Calculate trailing variance in pandas
I have a dataframe that looks like this:
| symbol | date | close
----|--------|------------|----------
0 | APX | 5/31/2017 | 4.04
1 | APX | 6/30/2017 | 5.4
2 | APX | 7/31/2017 | 4.15
3 | APX | 8/31/2017 | 9.95
4 | APX | 9/30/2017 | 10.3
5 | APX | 10/31/2017 | 5.58
6 | APX | 11/30/2017 | 8.47
7 | APX | 12/31/2017 | 15.66
8 | APX | 1/31/2018 | 10.55
9 | APX | 2/28/2018 | 9.8
10 | APX | 3/31/2018 | 7.43
11 | APX | 4/30/2018 | 8.93
12 | APX | 5/31/2018 | 7.61
13 | APX | 6/30/2018 | 7.79
14 | AURA | 1/31/2018 | 0.221382
15 | AURA | 2/28/2018 | 0.222236
16 | AURA | 3/31/2018 | 0.075488
17 | AURA | 4/30/2018 | 0.180699
18 | AURA | 5/31/2018 | 0.220009
19 | AURA | 6/30/2018 | 0.199029
20 | BASH | 11/30/2016 | 0.000447
21 | BASH | 12/31/2016 | 0.000376
22 | BASH | 1/31/2017 | 0.000452
23 | BASH | 2/28/2017 | 0.000414
24 | BASH | 3/31/2017 | 0.00045
25 | BASH | 4/30/2017 | 0.000754
26 | BASH | 5/31/2017 | 0.009115
27 | BASH | 6/30/2017 | 0.03419
28 | BASH | 7/31/2017 | 0.014037
29 | BASH | 8/31/2017 | 0.009117
30 | BASH | 9/30/2017 | 0.002333
31 | BASH | 10/31/2017 | 0.00258
32 | BASH | 11/30/2017 | 0.003415
33 | BASH | 12/31/2017 | 0.003756
34 | BASH | 1/31/2018 | 0.005454
35 | BASH | 2/28/2018 | 0.006186
36 | BASH | 3/31/2018 | 0.004155
37 | BASH | 4/30/2018 | 0.005078
38 | BASH | 5/31/2018 | 0.003696
39 | BASH | 6/30/2018 | 0.003442
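I want to calculate the 6-month trailing variance for each symbol and add it as a new column to the dataframe. The variance should be calculated from the values in the close column.
For example, APX has 14 observations, so the first variance should be calculated from the values 4.04, 5.4, 4.15, 9.95, 10.3 and 5.58.
The next variance should be calculated from 5.4, 4.15, 9.95, 10.3, 5.58 and 8.47, and so on.
I assume I need to use the df.var function to calculate the variance, but how do I tell it to use the trailing 6 months for each symbol?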
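You can combine groupby and rolling(6) with var() to get the rolling variance over the 6 most recent observations (the current row and the 5 before it), computed separately within each group. Setting min_periods to 6 makes the function return a value only when the window holds at least 6 observations, so the first 5 rows of each group are NaN. For an integer window, min_periods already defaults to the window size, so passing it here just makes that explicit; with a time-based (offset) window the default is 1, and the first results would then be computed from fewer observations.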
df['trailing_var'] = df.groupby('symbol')['close'].rolling(6, min_periods=6).var().reset_index(drop=True)
Result:
symbol date close trailing_var
0 APX 5/31/2017 4.040000 NaN
1 APX 6/30/2017 5.400000 NaN
2 APX 7/31/2017 4.150000 NaN
3 APX 8/31/2017 9.950000 NaN
4 APX 9/30/2017 10.30000 NaN
5 APX 10/31/2017 5.580000 7.988720e+00
6 APX 11/30/2017 8.470000 6.776377e+00
7 APX 12/31/2017 15.66000 1.648918e+01
8 APX 1/31/2018 10.55000 1.085291e+01
9 APX 2/28/2018 9.800000 1.086476e+01
10 APX 3/31/2018 7.430000 1.196206e+01
11 APX 4/30/2018 8.930000 8.470240e+00
12 APX 5/31/2018 7.610000 9.167987e+00
13 APX 6/30/2018 7.790000 1.662630e+00
14 AURA 1/31/2018 0.221382 NaN
15 AURA 2/28/2018 0.222236 NaN
16 AURA 3/31/2018 0.075488 NaN
17 AURA 4/30/2018 0.180699 NaN
18 AURA 5/31/2018 0.220009 NaN
19 AURA 6/30/2018 0.199029 3.226191e-03
20 BASH 11/30/2016 0.000447 NaN
21 BASH 12/31/2016 0.000376 NaN
22 BASH 1/31/2017 0.000452 NaN
23 BASH 2/28/2017 0.000414 NaN
24 BASH 3/31/2017 0.000450 NaN
25 BASH 4/30/2017 0.000754 1.859857e-08
26 BASH 5/31/2017 0.009115 1.241904e-05
27 BASH 6/30/2017 0.034190 1.820075e-04
28 BASH 7/31/2017 0.014037 1.741278e-04
29 BASH 8/31/2017 0.009117 1.539841e-04
30 BASH 9/30/2017 0.002333 1.464200e-04
31 BASH 10/31/2017 0.002580 1.390604e-04
32 BASH 11/30/2017 0.003415 1.508145e-04
33 BASH 12/31/2017 0.003756 2.221467e-05
34 BASH 1/31/2018 0.005454 6.464003e-06
35 BASH 2/28/2018 0.006186 2.415413e-06
36 BASH 3/31/2018 0.004155 1.787309e-06
37 BASH 4/30/2018 0.005078 1.150985e-06
38 BASH 5/31/2018 0.003696 1.022634e-06
39 BASH 6/30/2018 0.003442 1.160249e-06
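The one-liner above relies on the frame having a default 0..n-1 index and already being sorted by symbol, because reset_index(drop=True) discards the original row labels. As a hedged sketch of an alternative (not part of the original answer), groupby(...).transform returns a result indexed like the input, so the assignment lines up even when the rows are interleaved. The small frame below is a made-up subset of the question's data, shuffled on purpose, and the first complete APX window is cross-checked against the standard library's sample variance.

```python
import statistics

import pandas as pd

# Illustrative subset of the question's data, deliberately NOT sorted by symbol.
df = pd.DataFrame({
    "symbol": ["AURA", "APX", "APX", "APX", "AURA", "APX", "APX", "APX", "APX"],
    "close": [0.221382, 4.04, 5.4, 4.15, 0.222236, 9.95, 10.3, 5.58, 8.47],
})

# transform keeps the original index, so no reset_index is needed
# and each result is written back to the row it belongs to.
df["trailing_var"] = (
    df.groupby("symbol")["close"]
      .transform(lambda s: s.rolling(6, min_periods=6).var())
)

# Cross-check the first complete APX window by hand.
# Rolling.var() uses ddof=1 by default, i.e. the sample variance.
first_window = [4.04, 5.4, 4.15, 9.95, 10.3, 5.58]
expected = statistics.variance(first_window)

print(df)
print(expected)
```

Note that rolling(6) counts rows, not calendar months, so it equals a 6-month window only when each symbol has exactly one observation per month with no gaps.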