Python pandas: "Empty DataFrame" when selecting an interval of that DataFrame

Question

I am trying to select the rows of a DataFrame that fall between two dates. The problem is that when I try, I get:

Empty DataFrame

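I imported some historical financial data and then set the date column as the index (a DatetimeIndex).

When I select a single row by its date, it works. When I select an interval of dates, it does not, even though each row in that interval can be selected individually.

I tried filling possibly empty cells with fillna(), but that did not help.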

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from datetime import datetime

# Open the Euro Stoxx 50 csv file, rename columns and set dates as index

euro_stoxx_50 = pd.read_csv('STOXX50E.csv', parse_dates = True, index_col = 0)
euro_stoxx_50.columns = ['open', 'high', 'low', 'close', 'volume', 'adj close']
euro_stoxx_50.index.names = ['date']

An example of my problem:

print(euro_stoxx_50.head())
print(euro_stoxx_50.index)
print(euro_stoxx_50.empty)
print(euro_stoxx_50['2012':'2015'].empty)

gives:

date         open     high      low    close    volume  adj close                                              
2015-09-25  3113.16  3113.16  3113.16  3113.16       0    3113.16
2015-09-24  3019.34  3019.34  3019.34  3019.34       0    3019.34
2015-09-23  3079.99  3079.99  3079.99  3079.99       0    3079.99
2015-09-22  3076.05  3076.05  3076.05  3076.05       0    3076.05
2015-09-21  3184.72  3184.72  3184.72  3184.72       0    3184.72

<class 'pandas.tseries.index.DatetimeIndex'>
[2015-09-25, ..., 1986-12-31]
Length: 7396, Freq: None, Timezone: None

False

True

and

print(euro_stoxx_50['2012-9-12'])
print(euro_stoxx_50['2012-9-13'])
print(euro_stoxx_50['2012-9-12':'2012-9-13'])

gives:

date        open    high     low   close  volume  adj close                                                        
2012-09-12  2564.8  2564.8  2564.8  2564.8       0     2564.8


date           open     high      low    close  volume  adj close
2012-09-13  2543.22  2543.22  2543.22  2543.22       0    2543.22

Empty DataFrame
Columns: [open, high, low, close, volume, adj close]
Index: []

Edit

Thank you for your help!

Answer 1

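If I understand correctly, you want to filter the rows whose date falls between two points. If so, and assuming 'date' is a regular column, you can do this: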

first = pd.to_datetime('2012-1-1')
last = pd.to_datetime('2015-1-1')

df[(df['date'] > first) & (df['date'] < last)]
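In the question, 'date' is the index rather than a column, so the same comparison can be made against df.index. A minimal sketch, assuming a small made-up frame in place of the real CSV (the values are hypothetical):

```python
import pandas as pd

# Hypothetical frame with 'date' as the index, as in the question.
idx = pd.to_datetime(["2015-06-01", "2013-06-01", "2011-06-01"])
df = pd.DataFrame({"close": [3.0, 2.0, 1.0]}, index=idx)
df.index.name = "date"

first = pd.to_datetime("2012-1-1")
last = pd.to_datetime("2015-1-1")

# Compare against the index itself, since there is no 'date' column here.
in_range = df[(df.index > first) & (df.index < last)]
print(in_range)
```

A boolean mask like this does not depend on the order in which the rows are stored.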

Edit: since 'date' is the index, you can use loc:

df.loc[first:last]
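The index shown in the question is stored newest-first, so a first:last slice may come back empty unless the index is sorted in ascending order beforehand. A minimal sketch of that, assuming made-up daily rows in place of the real data:

```python
import pandas as pd

# Hypothetical stand-in for the CSV: daily rows stored newest-first.
idx = pd.date_range("2012-09-10", "2012-09-14", freq="D")[::-1]
df = pd.DataFrame({"close": [5.0, 4.0, 3.0, 2.0, 1.0]}, index=idx)
df.index.name = "date"

first = pd.to_datetime("2012-09-12")
last = pd.to_datetime("2012-09-13")

# Sort oldest-first so that first:last follows the order of the index.
df_sorted = df.sort_index()
selected = df_sorted.loc[first:last]
print(selected)
```

sort_index() returns a new frame, so the original row order is left untouched unless the result is assigned back.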

Answer 2

I found that when the DataFrame is indexed by the date column, ix indexing with datetime strings works. For example, given the following data in test.txt:

date        open     high     low      close    volume    adj
2015-09-25  3113.16  3113.16  3113.16  3113.16       0    3113.16
2015-09-24  3019.34  3019.34  3019.34  3019.34       0    3019.34
2015-09-23  3079.99  3079.99  3079.99  3079.99       0    3079.99
2015-09-22  3076.05  3076.05  3076.05  3076.05       0    3076.05
2015-09-21  3184.72  3184.72  3184.72  3184.72       0    3184.72

import pandas as pd

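df = pd.read_csv('test.txt', sep=r"\s+")
df['date'] = pd.to_datetime(df['date'])
df.set_index('date', inplace=True)  # 'date' becomes the DatetimeIndex
# .ix was removed in pandas 1.0; .loc is the label-based replacement
df.ix['2015-09-25':'2015-09-22']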
Out[15]: 
               open     high      low    close  volume      adj
date                                                           
2015-09-25  3113.16  3113.16  3113.16  3113.16       0  3113.16
2015-09-24  3019.34  3019.34  3019.34  3019.34       0  3019.34
2015-09-23  3079.99  3079.99  3079.99  3079.99       0  3079.99
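The slice above gives its bounds newest-first, matching the order of the rows. That ordering may be why the oldest-first bounds in the question return nothing. A minimal sketch of the effect, assuming made-up rows and using Timestamp bounds in place of strings:

```python
import pandas as pd

# Hypothetical rows stored newest-first, like test.txt.
idx = pd.to_datetime(
    ["2015-09-25", "2015-09-24", "2015-09-23", "2015-09-22", "2015-09-21"]
)
df = pd.DataFrame({"close": [5.0, 4.0, 3.0, 2.0, 1.0]}, index=idx)
df.index.name = "date"

start = pd.Timestamp("2015-09-22")
stop = pd.Timestamp("2015-09-24")

# True when the index runs from newest to oldest.
print(df.index.is_monotonic_decreasing)

# Oldest-first bounds run against the order of the index and select nothing.
oldest_first = df.loc[start:stop]

# Bounds given in the index's own order select the rows in between.
newest_first = df.loc[stop:start]
print(oldest_first.empty, len(newest_first))
```

If is_monotonic_decreasing is True, either give the bounds in the index's own order or sort the index first.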
