夏令时偏移 1 小时时重新索引不正确

Incorrect reindexing when Summer Time shifts by 1 hour

我正在尝试解决美国夏令时区发生的 1 小时时差问题。

这是时间序列的一部分(截图如下)

 In [3] eurusd
    Out[3]:
                         BID-CLOSE
    TIME                          
    1994-03-28 22:00:00    1.15981
    1994-03-29 22:00:00    1.16681
    1994-03-30 22:00:00    1.15021
    1994-03-31 22:00:00    1.14851
    1994-04-03 21:00:00    1.14081
    1994-04-04 21:00:00    1.13921
    1994-04-05 21:00:00    1.13881
    1994-04-06 21:00:00    1.14351
    1994-04-07 21:00:00    1.14411
    1994-04-10 21:00:00    1.14011
    1994-04-11 21:00:00    1.14391
    1994-04-12 21:00:00    1.14451
    1994-04-13 21:00:00    1.14201
    1994-04-14 21:00:00    1.13911
    1994-04-17 21:00:00    1.14821
    1994-04-18 21:00:00    1.15181
    1994-04-19 21:00:00    1.15621
    1994-04-20 21:00:00    1.15381
    1994-04-21 21:00:00    1.16201
    1994-04-24 21:00:00    1.16251
    1994-04-25 21:00:00    1.16721
    1994-04-26 21:00:00    1.17101
    1994-04-27 21:00:00    1.17721
    1994-04-28 21:00:00    1.18421
    1994-05-01 21:00:00    1.18751
    1994-05-02 21:00:00    1.17331
    1994-05-03 21:00:00    1.16801
    1994-05-04 21:00:00    1.17141
    1994-05-05 21:00:00    1.17691
    1994-05-08 21:00:00    1.16541
                           ...
    1994-09-26 21:00:00    1.25501
    1994-09-27 21:00:00    1.25761
    1994-09-28 21:00:00    1.25541
    1994-09-29 21:00:00    1.25421
    1994-10-02 21:00:00    1.25721
    1994-10-03 21:00:00    1.26131
    1994-10-04 21:00:00    1.26121
    1994-10-05 21:00:00    1.26101
    1994-10-06 21:00:00    1.25761
    1994-10-10 21:00:00    1.26161
    1994-10-11 21:00:00    1.26341
    1994-10-12 21:00:00    1.27821
    1994-10-13 21:00:00    1.29411
    1994-10-16 21:00:00    1.29401
    1994-10-17 21:00:00    1.29371
    1994-10-18 21:00:00    1.29531
    1994-10-19 21:00:00    1.29681
    1994-10-20 21:00:00    1.29971
    1994-10-23 21:00:00    1.30411
    1994-10-24 21:00:00    1.30311
    1994-10-25 21:00:00    1.30091
    1994-10-26 21:00:00    1.28921
    1994-10-27 21:00:00    1.29341
    1994-10-30 22:00:00    1.29931
    1994-10-31 22:00:00    1.29281
    1994-11-01 22:00:00    1.27771
    1994-11-02 22:00:00    1.27821
    1994-11-03 22:00:00    1.28321
    1994-11-06 22:00:00    1.28751
    1994-11-07 22:00:00    1.27091

目前,当我使用以下方法应用新的日期范围时:

idx = pd.date_range('1994-03-28 22:00:00', '1994-11-07 22:00:00', freq= 'D')

In [4] idx
Out[4]: 
DatetimeIndex(['1994-03-28 22:00:00', '1994-03-29 22:00:00',
               '1994-03-30 22:00:00', '1994-03-31 22:00:00',
               '1994-04-01 22:00:00', '1994-04-02 22:00:00',
               '1994-04-03 22:00:00', '1994-04-04 22:00:00',
               '1994-04-05 22:00:00', '1994-04-06 22:00:00',
               ...
               '1994-10-29 22:00:00', '1994-10-30 22:00:00',
               '1994-10-31 22:00:00', '1994-11-01 22:00:00',
               '1994-11-02 22:00:00', '1994-11-03 22:00:00',
               '1994-11-04 22:00:00', '1994-11-05 22:00:00',
               '1994-11-06 22:00:00', '1994-11-07 22:00:00'],
              dtype='datetime64[ns]', length=225, freq='D')

然后,我使用新的日期范围重新索引数据框,时间序列将所有 21:00 值转换为 22:00,BID-CLOSE 变为 NaN。我明白为什么,但是我不确定如何根据美国夏令时时间表让代码知道 1 小时的时间步长。

重建索引的输出:

In[5]: eurusd_copy1 = eurusd.reindex(idx, fill_value=None)

In[6]: eurusd_copy1
Out[6]: 
                     BID-CLOSE
1994-03-28 22:00:00    1.15981
1994-03-29 22:00:00    1.16681
1994-03-30 22:00:00    1.15021
1994-03-31 22:00:00    1.14851
1994-04-01 22:00:00        NaN
1994-04-02 22:00:00        NaN
1994-04-03 22:00:00        NaN
1994-04-04 22:00:00        NaN
1994-04-05 22:00:00        NaN
1994-04-06 22:00:00        NaN
1994-04-07 22:00:00        NaN
1994-04-08 22:00:00        NaN
1994-04-09 22:00:00        NaN
1994-04-10 22:00:00        NaN
1994-04-11 22:00:00        NaN
1994-04-12 22:00:00        NaN
1994-04-13 22:00:00        NaN
1994-04-14 22:00:00        NaN
1994-04-15 22:00:00        NaN
1994-04-16 22:00:00        NaN
1994-04-17 22:00:00        NaN
1994-04-18 22:00:00        NaN
1994-04-19 22:00:00        NaN
1994-04-20 22:00:00        NaN
1994-04-21 22:00:00        NaN
1994-04-22 22:00:00        NaN
1994-04-23 22:00:00        NaN
1994-04-24 22:00:00        NaN
1994-04-25 22:00:00        NaN
1994-04-26 22:00:00        NaN
                       ...
1994-10-09 22:00:00        NaN
1994-10-10 22:00:00        NaN
1994-10-11 22:00:00        NaN
1994-10-12 22:00:00        NaN
1994-10-13 22:00:00        NaN
1994-10-14 22:00:00        NaN
1994-10-15 22:00:00        NaN
1994-10-16 22:00:00        NaN
1994-10-17 22:00:00        NaN
1994-10-18 22:00:00        NaN
1994-10-19 22:00:00        NaN
1994-10-20 22:00:00        NaN
1994-10-21 22:00:00        NaN
1994-10-22 22:00:00        NaN
1994-10-23 22:00:00        NaN
1994-10-24 22:00:00        NaN
1994-10-25 22:00:00        NaN
1994-10-26 22:00:00        NaN
1994-10-27 22:00:00        NaN
1994-10-28 22:00:00        NaN
1994-10-29 22:00:00        NaN
1994-10-30 22:00:00    1.29931
1994-10-31 22:00:00    1.29281
1994-11-01 22:00:00    1.27771
1994-11-02 22:00:00    1.27821
1994-11-03 22:00:00    1.28321
1994-11-04 22:00:00        NaN
1994-11-05 22:00:00        NaN
1994-11-06 22:00:00    1.28751
1994-11-07 22:00:00    1.27091

[225 rows x 1 columns]

所需的输出将用 NaN 填充任何日期间隙,但是保持已经有日期的 BID-CLOSE 值未更改。请注意下面的输出是虚构的,只是为了说明所需的结果。

                       BID-CLOSE
28/03/1994 22:00:00 1.15981
29/03/1994 22:00:00 1.16681
30/03/1994 22:00:00 1.15021
31/03/1994 22:00:00 1.14851
01/04/1994 21:00:00     NaN
02/04/1994 21:00:00     NaN
03/04/1994 21:00:00 1.13881
04/04/1994 21:00:00 1.14351
05/04/1994 21:00:00 1.14411
06/04/1994 21:00:00 1.14011
07/04/1994 21:00:00 1.14391
08/04/1994 21:00:00     NaN
09/04/1994 21:00:00     NaN
10/04/1994 21:00:00 1.14451
11/04/1994 21:00:00 1.14201
12/04/1994 21:00:00 1.13911
13/04/1994 21:00:00 1.14821
        …    
25/10/1994 21:00:00 1.29371
26/10/1994 21:00:00     NaN
27/10/1994 21:00:00 1.29681
28/10/1994 21:00:00 1.29971
29/10/1994 21:00:00 1.30411
30/10/1994 22:00:00 1.30311
31/10/1994 22:00:00     NaN
01/11/1994 22:00:00     NaN
02/11/1994 22:00:00 1.29341

如何让代码知道美国时区?

我猜你的日期索引是时区天真。

首先设置时区,我假设它们是UTC

eurusd = eurusd.tz_localize('UTC')

然后您可以将它们转换为您喜欢的任何时区

eurusd = eurusd.tz_convert('America/New_York')

然后您可以根据需要重新编制索引