Python Pandas - Creating a timeseries from a csv file
I want to create a time series 'aapl' by reading a csv file and setting the first column as a DatetimeIndex.
Here are some lines from the csv file:
2000-01-03, 111.937502
2000-01-04, 102.500003
2000-01-05, 103.999997
2000-01-06, 94.999998
2000-01-07, 99.500001
The result should look like this:
In [1]: aapl.head()
Out[1]:
Date
2000-01-03 111.937502
2000-01-04 102.500003
2000-01-05 103.999997
2000-01-06 94.999998
2000-01-07 99.500001
Name: AAPL, dtype: float64
In [2]: type(aapl)
Out[2]: pandas.core.series.Series
In [3]: type(aapl.index)
Out[3]: pandas.tseries.index.DatetimeIndex
I tried:
aapl = pd.read_csv('aapl.csv', header=None)
aapl[0] = pd.to_datetime(aapl[0])
aapl.set_index(0, inplace=True)
aapl.index.name = 'Date'
print(type(aapl))
print(type(aapl.index))
print(aapl.head())
But this leaves me with:
<class 'pandas.core.frame.DataFrame'>
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>
1
Date
2000-01-03 111.937502
2000-01-04 102.500003
2000-01-05 103.999997
2000-01-06 94.999998
2000-01-07 99.500001
It is still a DataFrame, not a Series, and the column holding the values still has a column name.
All suggestions are welcome!
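I think you can use squeeze to convert the result to a Series. Older pandas versions accepted squeeze=True as an argument to read_csv; that argument was deprecated in pandas 1.4 and removed in 2.0, so the sample calls .squeeze("columns") on the result of read_csv instead.
Sample:
import pandas as pd
from io import StringIO
temp = """2000-01-03,111.937502
2000-01-04,102.500003
2000-01-05,103.999997
2000-01-06,94.999998
2000-01-07,99.500001"""
# after testing, replace 'StringIO(temp)' with 'filename.csv'
aapl = pd.read_csv(StringIO(temp),
                   index_col=[0],
                   parse_dates=True,
                   names=['Date', 'col']).squeeze("columns")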
print(type(aapl))
<class 'pandas.core.series.Series'>
print(type(aapl.index))
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>
print(aapl.head())
Date
2000-01-03 111.937502
2000-01-04 102.500003
2000-01-05 103.999997
2000-01-06 94.999998
2000-01-07 99.500001
Name: col, dtype: float64
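The attempt from the question can also be finished without squeeze: once the date column is the index, selecting the one remaining column gives a Series. This is a minimal sketch, assuming the file has no header row and exactly two columns. The name AAPL is taken from the desired output in the question, and skipinitialspace=True is an addition here to tolerate the space after each comma.

```python
from io import StringIO

import pandas as pd

# Sample rows from the question; on real data pass 'aapl.csv' instead of StringIO(csv_text).
csv_text = """2000-01-03, 111.937502
2000-01-04, 102.500003
2000-01-05, 103.999997
2000-01-06, 94.999998
2000-01-07, 99.500001"""

# Same steps as the attempt in the question (plus skipinitialspace).
frame = pd.read_csv(StringIO(csv_text), header=None, skipinitialspace=True)
frame[0] = pd.to_datetime(frame[0])
frame = frame.set_index(0)
frame.index.name = "Date"

# Selecting the single remaining column (labelled 1) returns a Series;
# rename() with a scalar sets the Series name.
aapl = frame[1].rename("AAPL")

print(type(aapl))
print(type(aapl.index))
print(aapl.head())
```

Selecting the column by label makes it explicit which values end up in the Series, which may be preferable if the file could later gain extra columns.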