How can I compute one-day-ahead returns in a pandas DataFrame?
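I have been working on a project in which I am trying to compute log returns, expressed as percentages, over a period of time.
I have stored all the daily adjusted closing prices in a pandas DataFrame, as shown below: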
{'SP500': {Timestamp('2009-12-31 00:00:00'): 1115.0999755859375,
Timestamp('2010-01-04 00:00:00'): 1132.989990234375,
Timestamp('2010-01-05 00:00:00'): 1136.52001953125,
Timestamp('2010-01-06 00:00:00'): 1137.1400146484375,
Timestamp('2010-01-07 00:00:00'): 1141.68994140625},
'A': {Timestamp('2009-12-31 00:00:00'): 20.28476333618164,
Timestamp('2010-01-04 00:00:00'): 20.43492889404297,
Timestamp('2010-01-05 00:00:00'): 20.21295928955078,
Timestamp('2010-01-06 00:00:00'): 20.141132354736328,
Timestamp('2010-01-07 00:00:00'): 20.11502456665039},
'AAL': {Timestamp('2009-12-31 00:00:00'): 4.562869548797607,
Timestamp('2010-01-04 00:00:00'): 4.496876239776611,
Timestamp('2010-01-05 00:00:00'): 5.005957126617432,
Timestamp('2010-01-06 00:00:00'): 4.79855489730835,
Timestamp('2010-01-07 00:00:00'): 4.93996524810791},
'AAP': {Timestamp('2009-12-31 00:00:00'): 38.3176383972168,
Timestamp('2010-01-04 00:00:00'): 38.22296905517578,
Timestamp('2010-01-05 00:00:00'): 37.99578857421875,
Timestamp('2010-01-06 00:00:00'): 38.32709884643555,
Timestamp('2010-01-07 00:00:00'): 38.3176383972168},
'AAPL': {Timestamp('2009-12-31 00:00:00'): 6.471692085266113,
Timestamp('2010-01-04 00:00:00'): 6.572423458099365,
Timestamp('2010-01-05 00:00:00'): 6.583786487579346,
Timestamp('2010-01-06 00:00:00'): 6.479064464569092,
Timestamp('2010-01-07 00:00:00'): 6.467087268829346}}
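I take the daily return to be defined as follows: on day t, the return is the log of the adjusted closing price on day t minus the log of the adjusted closing price on day t-1. I applied this code:
import numpy as np
for i in df.columns:
    df[i] = np.log(df[i]) - np.log(df[i].shift(1))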
I have checked it, and it gives me the expected result, which is:
r_t^i = ln(AdjClosingPrice_t) - ln(AdjClosingPrice_{t-1})
for each column:
{'SP500': {Timestamp('2009-12-31 00:00:00'): nan,
Timestamp('2010-01-04 00:00:00'): 0.015916082167126255,
Timestamp('2010-01-05 00:00:00'): 0.003110831966759875,
Timestamp('2010-01-06 00:00:00'): 0.0005453718878234426,
Timestamp('2010-01-07 00:00:00'): 0.003993218354654715},
'A': {Timestamp('2009-12-31 00:00:00'): nan,
Timestamp('2010-01-04 00:00:00'): 0.007375607740701007,
Timestamp('2010-01-05 00:00:00'): -0.010921689703742743,
Timestamp('2010-01-06 00:00:00'): -0.003559837812704636,
Timestamp('2010-01-07 00:00:00'): -0.001297083166547086},
'AAL': {Timestamp('2009-12-31 00:00:00'): nan,
Timestamp('2010-01-04 00:00:00'): -0.014568725834338547,
Timestamp('2010-01-05 00:00:00'): 0.10724564178274565,
Timestamp('2010-01-06 00:00:00'): -0.042313819049169865,
Timestamp('2010-01-07 00:00:00'): 0.029043486854613887},
'AAP': {Timestamp('2009-12-31 00:00:00'): nan,
Timestamp('2010-01-04 00:00:00'): -0.0024737036578925675,
Timestamp('2010-01-05 00:00:00'): -0.005961292490163306,
Timestamp('2010-01-06 00:00:00'): 0.008681861089003373,
Timestamp('2010-01-07 00:00:00'): -0.0002468649409474999},
'AAPL': {Timestamp('2009-12-31 00:00:00'): nan,
Timestamp('2010-01-04 00:00:00'): 0.015445029590123394,
Timestamp('2010-01-05 00:00:00'): 0.0017274020941329127,
Timestamp('2010-01-06 00:00:00'): -0.0160339066767059,
Timestamp('2010-01-07 00:00:00'): -0.001850310332141225}}
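The column-by-column loop above can also be written without a loop. The following is a minimal sketch, assuming the prices live in a DataFrame called `prices` (an illustrative name, not one used in the question) and reproducing only the SP500 and AAPL columns from the data above. Since the question mentions percentages, the last step scales the result by 100; that scaling is an assumption about what "as a percentage" means here.

```python
import numpy as np
import pandas as pd

# Two of the price columns from the question; `prices` is an illustrative name.
idx = pd.to_datetime(
    ["2009-12-31", "2010-01-04", "2010-01-05", "2010-01-06", "2010-01-07"]
)
prices = pd.DataFrame(
    {
        "SP500": [1115.0999755859375, 1132.989990234375, 1136.52001953125,
                  1137.1400146484375, 1141.68994140625],
        "AAPL": [6.471692085266113, 6.572423458099365, 6.583786487579346,
                 6.479064464569092, 6.467087268829346],
    },
    index=idx,
)

# Log return on day t: ln(P_t) - ln(P_{t-1}), computed for every column at once.
# The first row is NaN because there is no previous price.
log_returns = np.log(prices).diff()

# The same quantity expressed in percent (assumed meaning of "as a percentage").
log_returns_pct = 100 * log_returns

print(log_returns)
```

Unlike the loop, this leaves the original prices untouched and stores the returns in a separate object.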
My question has two parts:
- How can I get r_{t+1}^i?
Would it be r_{t+1}^i = ln(AdjClosingPrice_{t+1}) - ln(AdjClosingPrice_t)?
- Do you know how I should write the loop to compute it?
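Shift the result column up, so that today's row holds tomorrow's result, if that is what you mean. In this example, column 'c' is shifted.
import pandas as pd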
df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [1, 2, 3, 4, 5], 'c': [1, 2, 3, 4, 5]})
print(df)
df.c = df.c.shift(-1)
print(df)
Output of the first print(df):
   a  b  c
0  1  1  1
1  2  2  2
2  3  3  3
3  4  4  4
4  5  5  5
Output of print(df) after df.c = df.c.shift(-1):
   a  b    c
0  1  1  2.0
1  2  2  3.0
2  3  3  4.0
3  4  4  5.0
4  5  5  NaN
Based on the answer above and some testing of various combinations, I found the answer to my question:
for i in df.columns:
    df[i] = np.log(df[i].shift(-1)) - np.log(df[i])
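The loop overwrites the price columns in place. As a rough sketch of an alternative, the one-day-ahead return can be obtained by computing the ordinary log return and shifting it up by one row, which keeps the prices intact. The names `prices`, `log_returns`, `next_day_returns` and `loop_form` are illustrative, and only the AAPL column from the question is reproduced.

```python
import numpy as np
import pandas as pd

# Small illustrative price frame (AAPL values taken from the question).
prices = pd.DataFrame(
    {"AAPL": [6.471692085266113, 6.572423458099365, 6.583786487579346,
              6.479064464569092, 6.467087268829346]},
    index=pd.to_datetime(["2009-12-31", "2010-01-04", "2010-01-05",
                          "2010-01-06", "2010-01-07"]),
)

# Return realised on day t: ln(P_t) - ln(P_{t-1}); NaN in the first row.
log_returns = np.log(prices).diff()

# One-day-ahead return: row t holds ln(P_{t+1}) - ln(P_t); NaN in the last row.
next_day_returns = log_returns.shift(-1)

# The same values, written the way the loop computes them for each column.
loop_form = np.log(prices.shift(-1)) - np.log(prices)

print(next_day_returns)
```

Note that `shift(-1)` moves values by one row, not by one calendar day, so "tomorrow" here means the next row in the index (for example, 2009-12-31 is paired with 2010-01-04).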