Slicing a pandas DataFrame via its datetime index
I am trying to slice a pandas DataFrame that was read from a CSV file, with the index set from the first column, which holds dates.
Input:
df = pd.read_csv(r'E:\...\^d.csv')
df["Date"] = pd.to_datetime(df["Date"])
Output:
Date Open High Low Close Volume
0 1920-01-02 9.52 9.52 9.52 9.52 NaN
1 1920-01-03 9.62 9.62 9.62 9.62 NaN
2 1920-01-05 9.57 9.57 9.57 9.57 NaN
3 1920-01-06 9.46 9.46 9.46 9.46 NaN
4 1920-01-07 9.47 9.47 9.47 9.47 NaN
Date Open High Low Close Volume
26798 2020-10-26 3441.42 3441.42 3364.86 3400.97 2.435787e+09
26799 2020-10-27 3403.15 3409.51 3388.71 3390.68 2.395102e+09
26800 2020-10-28 3342.48 3342.48 3268.89 3271.03 3.147944e+09
26801 2020-10-29 3277.17 3341.05 3259.82 3310.11 2.752626e+09
26802 2020-10-30 3293.59 3304.93 3233.94 3269.96 3.002804e+09
Input:
df = df.set_index(['Date'])
print("my index type is ")
print(df.index.dtype)
print(type(df.index)) #type of index
Output:
Open High Low Close Volume
Date
2007-01-03 1418.03 1429.42 1407.86 1416.60 1.905089e+09
2007-01-04 1416.95 1421.84 1408.22 1418.34 1.669144e+09
2007-01-05 1418.34 1418.34 1405.75 1409.71 1.621889e+09
2007-01-08 1409.22 1414.98 1403.97 1412.84 1.535189e+09
2007-01-09 1412.85 1415.61 1405.42 1412.11 1.687989e+09
... ... ... ... ...
2009-12-24 1120.59 1126.48 1120.59 1126.48 7.042833e+08
2009-12-28 1126.48 1130.38 1123.51 1127.78 1.509111e+09
2009-12-29 1127.78 1130.38 1126.08 1126.19 1.383900e+09
2009-12-30 1126.19 1126.42 1121.94 1126.42 1.265167e+09
2009-12-31 1126.42 1127.64 1114.81 1115.10 1.153883e+09
my index type is
datetime64[ns]
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>
I tried to slice out the Mondays using:
monday_dow = df["Date"].dt.dayofweek==0
Output (Spyder returns):
KeyError: 'Date'
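I have read many similar answers on Stack Overflow but could not solve this. I understand that I am doing something wrong with the index; should it be accessed in another way?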
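You need to filter by DatetimeIndex.dayofweek. The .dt accessor is removed because it is only used for columns, and after set_index the Date values live in the index, which is why df["Date"] raises a KeyError: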
monday_dow = df.index.dayofweek==0
Then, if you need all the matching rows:
df1 = df[monday_dow]
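To make the difference concrete, here is a small self-contained sketch. The frame is made up for illustration (seven consecutive days starting on 2020-10-26, a Monday, with placeholder Close values) and is not the data from the question. It shows that .dt works while Date is a column, and that the DatetimeIndex exposes dayofweek directly once Date has become the index.

```python
import pandas as pd

# Hypothetical sample data: seven consecutive days starting on Monday 2020-10-26.
dates = pd.date_range("2020-10-26", periods=7, freq="D")
df = pd.DataFrame({"Date": dates, "Close": list(range(7))})

# While "Date" is still a column, the .dt accessor is required.
monday_col = df["Date"].dt.dayofweek == 0

# After set_index, "Date" is no longer a column, so df["Date"] would raise KeyError.
df = df.set_index("Date")

# The DatetimeIndex exposes dayofweek directly, without .dt (Monday is 0).
monday_dow = df.index.dayofweek == 0

# The boolean mask keeps only the Monday rows.
df1 = df[monday_dow]
print(df1)
```

If both styles are needed in the same script, one option is to keep Date as a column as well, for example by passing drop=False to set_index.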
The code can also be simplified by setting the DatetimeIndex directly in read_csv:
df = pd.read_csv(r'E:\...\^d.csv', index_col=['Date'], parse_dates=['Date'])
monday_dow = df.index.dayofweek==0
df1 = df[monday_dow]
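The same read_csv call can be tried without the original file. The sketch below is only an approximation of that setup: it feeds a small in-memory CSV through io.StringIO, using the last five rows shown in the question's output, and assumes the real file is comma-separated with the same header. The name csv_text is hypothetical.

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file; rows copied from the question's output.
# Assumption: the real file is comma-separated with this header.
csv_text = """Date,Open,High,Low,Close,Volume
2020-10-26,3441.42,3441.42,3364.86,3400.97,2.435787e+09
2020-10-27,3403.15,3409.51,3388.71,3390.68,2.395102e+09
2020-10-28,3342.48,3342.48,3268.89,3271.03,3.147944e+09
2020-10-29,3277.17,3341.05,3259.82,3310.11,2.752626e+09
2020-10-30,3293.59,3304.93,3233.94,3269.96,3.002804e+09
"""

# Parse "Date" as datetimes and use it as the index in a single step.
df = pd.read_csv(io.StringIO(csv_text), index_col=["Date"], parse_dates=["Date"])

# The index is already a DatetimeIndex, so dayofweek is available directly.
monday_dow = df.index.dayofweek == 0
df1 = df[monday_dow]

print(type(df.index))
print(df1)
```

Reading the dates as the index up front avoids the separate to_datetime and set_index steps from the question.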