Not getting the same results as pct_change when computing it manually
My code is as follows:
import pandas as pd
from pandas_datareader import data as web
import datetime
start = datetime.datetime(2021, 1, 1)
end = datetime.datetime.today()
df = web.DataReader('goog', 'yahoo', start, end)
df['pct']= df['Close'].pct_change()
Simple enough, and it produces:
High Low Open Close Volume Adj Close pct
Date
2020-12-31 1758.930054 1735.420044 1735.420044 1751.880005 1011900 1751.880005 NaN
2021-01-04 1760.650024 1707.849976 1757.540039 1728.239990 1901900 1728.239990 -0.013494
2021-01-05 1747.670044 1718.015015 1725.000000 1740.920044 1145300 1740.920044 0.007337
2021-01-06 1748.000000 1699.000000 1702.630005 1735.290039 2602100 1735.290039 -0.003234
2021-01-07 1788.400024 1737.050049 1740.060059 1787.250000 2265000 1787.250000 0.029943
... ... ... ... ... ... ... ...
2021-08-13 2773.479980 2760.100098 2767.149902 2768.120117 628600 2768.120117 0.000119
2021-08-16 2779.810059 2723.314941 2760.000000 2778.320068 902000 2778.320068 0.003685
2021-08-17 2774.370117 2735.750000 2763.820068 2746.010010 1063600 2746.010010 -0.011629
2021-08-18 2765.879883 2728.419922 2742.310059 2731.399902 746700 2731.399902 -0.005320
2021-08-19 2748.925049 2707.120117 2709.350098 2738.270020 856623 2738.270020 0.002515
160 rows × 7 columns
So the last row says pct is 0.002515.
My objective is to reproduce the same result without pct_change. I have this code:
(1- (df['Close'] / df['Close'].shift(-1))).shift(1)
which produces this:
Date
2020-12-31 NaN
2021-01-04 -0.013679
2021-01-05 0.007284
2021-01-06 -0.003244
2021-01-07 0.029073
...
2021-08-13 0.000119
2021-08-16 0.003671
2021-08-17 -0.011766
2021-08-18 -0.005349
2021-08-19 0.002509
Name: Close, Length: 160, dtype: float64
The last value I get is 0.002509 instead of 0.002515. Can you explain why I am off in the last 2 digits on every calculation?
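Percentage change is usually the change relative to the initial value:
(final - initial) / initial = final / initial - 1
Your expression computes the change relative to the final value instead, i.e. (final - initial) / final. Try
df['Close'] / df['Close'].shift(1) - 1
By the way, you only need to shift once in your original expression.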
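To make the difference concrete, here is a minimal sketch that compares the three expressions on the first five closing prices from the question's table. The prices are hard-coded so the snippet runs without downloading anything; the variable names (`builtin`, `manual`, `question_expr`, `single_shift`) are only illustrative.

```python
import pandas as pd

# Closing prices copied from the first rows of the table in the question.
close = pd.Series(
    [1751.880005, 1728.239990, 1740.920044, 1735.290039, 1787.250000],
    index=pd.to_datetime(
        ["2020-12-31", "2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07"]
    ),
    name="Close",
)

# Built-in: change relative to the previous (initial) value.
builtin = close.pct_change()

# Manual equivalent: final / initial - 1, with a single shift.
manual = close / close.shift(1) - 1

# The expression from the question, which shifts twice.
question_expr = (1 - (close / close.shift(-1))).shift(1)

# The same quantity written with one shift: (final - initial) / final.
single_shift = 1 - close.shift(1) / close

print(
    pd.DataFrame(
        {
            "pct_change": builtin,
            "manual": manual,
            "question_expr": question_expr,
            "single_shift": single_shift,
        }
    ).round(6)
)
```

The first two columns should agree with each other, and the last two should agree with each other. The two pairs differ only in the denominator: if r is the usual percentage change, the question's expression yields r / (1 + r), which is why the results diverge in the later digits when the daily moves are small.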