在 Pandas 数据框中填充货币缺失数据
Filling currency missing data in Pandas Dataframe
我正在从 Quandl 中提取货币数据:
import Quandl
df = Quandl.get("CURRFX/USDEUR", collapse="daily")
df = df.tail(10).reset_index()
运行上面的代码,可以看到周末当然没有货币数据了
数据框不显示缺失的天数行。
知道如何添加缺失的天数并用前一周的平均值填充它们吗?
更新:
我的代码是:
exchange = Quandl.get("CURRFX/USDEUR", trim_start="2016-02-01",trim_end="2016-02-28")
exchange = exchange.drop(['High (est)','Low (est)'], axis = 1)
exchange = exchange.reindex(pd.date_range(start=exchange.index[0], end=exchange.index[-1])).ffill().reset_index()
exchange = exchange.rename(columns={'index':'Day','Rate':'exchange'})
当然,想象一下我 运行 星期一早上,我仍然没有整个日期的数据,所以星期天和星期六保持空...
如何填写?
IIUC 然后你可以 reindex
and pass a new DatetimeIndex
generated from date_range
并调用 ffill
:
In [115]:
df.reindex(pd.date_range(start=df.index[0], end=df.index[-1])).ffill()
Out[115]:
Rate High (est) Low (est)
1999-09-06 0.941019 0.95269 0.92949
1999-09-07 0.945500 0.95636 0.93476
1999-09-08 0.944376 0.95588 0.93301
1999-09-09 0.943697 0.95412 0.93339
1999-09-10 0.948008 0.00000 0.00000
1999-09-11 0.948008 0.00000 0.00000
1999-09-12 0.948008 0.00000 0.00000
1999-09-13 0.959838 0.97143 0.94838
1999-09-14 0.964873 0.97623 0.95365
1999-09-15 0.965378 0.97708 0.95382
1999-09-16 0.963142 0.97393 0.95248
1999-09-17 0.961971 0.00000 0.00000
1999-09-18 0.961971 0.00000 0.00000
1999-09-19 0.961971 0.00000 0.00000
1999-09-20 0.960917 0.00000 0.00000
1999-09-21 0.959907 0.97166 0.94829
1999-09-22 0.954640 0.96633 0.94309
1999-09-23 0.955403 0.96649 0.94444
1999-09-24 0.955754 0.00000 0.00000
1999-09-25 0.955754 0.00000 0.00000
1999-09-26 0.955754 0.00000 0.00000
1999-09-27 0.956281 0.96693 0.94575
1999-09-28 0.955362 0.00000 0.00000
1999-09-29 0.949949 0.96176 0.93829
1999-09-30 0.942264 0.00000 0.00000
1999-10-01 0.934798 0.00000 0.00000
1999-10-02 0.934798 0.00000 0.00000
1999-10-03 0.934798 0.00000 0.00000
1999-10-04 0.932772 0.00000 0.00000
1999-10-05 0.934479 0.94584 0.92326
... ... ... ...
2016-01-17 0.917878 0.00000 0.00000
2016-01-18 0.917451 0.92788 0.90714
2016-01-19 0.918046 0.92896 0.90726
2016-01-20 0.916549 0.92688 0.90634
2016-01-21 0.918055 0.92901 0.90723
2016-01-22 0.921764 0.00000 0.00000
2016-01-23 0.921764 0.00000 0.00000
2016-01-24 0.921764 0.00000 0.00000
2016-01-25 0.923926 0.00000 0.00000
2016-01-26 0.922914 0.93427 0.91170
2016-01-27 0.920722 0.93114 0.91042
2016-01-28 0.917792 0.92875 0.90696
2016-01-29 0.917405 0.00000 0.00000
2016-01-30 0.917405 0.00000 0.00000
2016-01-31 0.917405 0.00000 0.00000
2016-02-01 0.919592 0.00000 0.00000
2016-02-02 0.918263 0.92859 0.90805
2016-02-03 0.913820 0.92514 0.90264
2016-02-04 0.903148 0.91308 0.89332
2016-02-05 0.895662 0.90574 0.88569
2016-02-06 0.895662 0.90574 0.88569
2016-02-07 0.895662 0.90574 0.88569
2016-02-08 0.896030 0.00000 0.00000
2016-02-09 0.892736 0.90329 0.88231
2016-02-10 0.888489 0.89846 0.87863
2016-02-11 0.885288 0.89508 0.87560
2016-02-12 0.885464 0.00000 0.00000
2016-02-13 0.885464 0.00000 0.00000
2016-02-14 0.885464 0.00000 0.00000
2016-02-15 0.890435 0.00000 0.00000
[6007 rows x 3 columns]
我正在从 Quandl 中提取货币数据:
import Quandl
df = Quandl.get("CURRFX/USDEUR", collapse="daily")
df = df.tail(10).reset_index()
运行上面的代码,可以看到周末当然没有货币数据了
数据框不显示缺失的天数行。
知道如何添加缺失的天数并用前一周的平均值填充它们吗?
更新:
我的代码是:
exchange = Quandl.get("CURRFX/USDEUR", trim_start="2016-02-01",trim_end="2016-02-28")
exchange = exchange.drop(['High (est)','Low (est)'], axis = 1)
exchange = exchange.reindex(pd.date_range(start=exchange.index[0], end=exchange.index[-1])).ffill().reset_index()
exchange = exchange.rename(columns={'index':'Day','Rate':'exchange'})
当然,想象一下我 运行 星期一早上,我仍然没有整个日期的数据,所以星期天和星期六保持空...
如何填写?
IIUC 然后你可以 reindex
and pass a new DatetimeIndex
generated from date_range
并调用 ffill
:
In [115]:
df.reindex(pd.date_range(start=df.index[0], end=df.index[-1])).ffill()
Out[115]:
Rate High (est) Low (est)
1999-09-06 0.941019 0.95269 0.92949
1999-09-07 0.945500 0.95636 0.93476
1999-09-08 0.944376 0.95588 0.93301
1999-09-09 0.943697 0.95412 0.93339
1999-09-10 0.948008 0.00000 0.00000
1999-09-11 0.948008 0.00000 0.00000
1999-09-12 0.948008 0.00000 0.00000
1999-09-13 0.959838 0.97143 0.94838
1999-09-14 0.964873 0.97623 0.95365
1999-09-15 0.965378 0.97708 0.95382
1999-09-16 0.963142 0.97393 0.95248
1999-09-17 0.961971 0.00000 0.00000
1999-09-18 0.961971 0.00000 0.00000
1999-09-19 0.961971 0.00000 0.00000
1999-09-20 0.960917 0.00000 0.00000
1999-09-21 0.959907 0.97166 0.94829
1999-09-22 0.954640 0.96633 0.94309
1999-09-23 0.955403 0.96649 0.94444
1999-09-24 0.955754 0.00000 0.00000
1999-09-25 0.955754 0.00000 0.00000
1999-09-26 0.955754 0.00000 0.00000
1999-09-27 0.956281 0.96693 0.94575
1999-09-28 0.955362 0.00000 0.00000
1999-09-29 0.949949 0.96176 0.93829
1999-09-30 0.942264 0.00000 0.00000
1999-10-01 0.934798 0.00000 0.00000
1999-10-02 0.934798 0.00000 0.00000
1999-10-03 0.934798 0.00000 0.00000
1999-10-04 0.932772 0.00000 0.00000
1999-10-05 0.934479 0.94584 0.92326
... ... ... ...
2016-01-17 0.917878 0.00000 0.00000
2016-01-18 0.917451 0.92788 0.90714
2016-01-19 0.918046 0.92896 0.90726
2016-01-20 0.916549 0.92688 0.90634
2016-01-21 0.918055 0.92901 0.90723
2016-01-22 0.921764 0.00000 0.00000
2016-01-23 0.921764 0.00000 0.00000
2016-01-24 0.921764 0.00000 0.00000
2016-01-25 0.923926 0.00000 0.00000
2016-01-26 0.922914 0.93427 0.91170
2016-01-27 0.920722 0.93114 0.91042
2016-01-28 0.917792 0.92875 0.90696
2016-01-29 0.917405 0.00000 0.00000
2016-01-30 0.917405 0.00000 0.00000
2016-01-31 0.917405 0.00000 0.00000
2016-02-01 0.919592 0.00000 0.00000
2016-02-02 0.918263 0.92859 0.90805
2016-02-03 0.913820 0.92514 0.90264
2016-02-04 0.903148 0.91308 0.89332
2016-02-05 0.895662 0.90574 0.88569
2016-02-06 0.895662 0.90574 0.88569
2016-02-07 0.895662 0.90574 0.88569
2016-02-08 0.896030 0.00000 0.00000
2016-02-09 0.892736 0.90329 0.88231
2016-02-10 0.888489 0.89846 0.87863
2016-02-11 0.885288 0.89508 0.87560
2016-02-12 0.885464 0.00000 0.00000
2016-02-13 0.885464 0.00000 0.00000
2016-02-14 0.885464 0.00000 0.00000
2016-02-15 0.890435 0.00000 0.00000
[6007 rows x 3 columns]