How to resample OHLCV data into 5-minute bars?
I have this data set:
2016-08-09 12:39:00,536.7841,536.7849,536.6141,536.7849,0.656
2016-08-09 12:40:00,536.6749,536.6749,536.6749,536.6749,0.2642
2016-08-09 12:41:00,535.84,535.84,535.615,535.615,0.348
2016-08-09 12:42:00,535.5401,535.5401,534.1801,534.1801,0.507
2016-08-09 12:43:00,534.5891,534.8753,534.5891,534.807,0.656
2016-08-09 12:44:00,534.8014,534.878,534.8014,534.8416,0.502
2016-08-09 12:45:00,534.8131,534.8131,534.2303,534.6736,0.552
2016-08-09 12:47:00,534.756,538.5999,534.756,534.7836,0.62647241
2016-08-09 12:48:00,536.0557,536.6864,536.0557,536.6864,1.2614
2016-08-09 12:49:00,536.8966,537.7289,536.8966,537.7289,0.532
2016-08-09 12:50:00,537.9829,539.2199,537.9829,539.2199,0.67752932
2016-08-09 12:51:00,538.5,539.2199,538.5,539.2199,0.43768953
I want to resample it into 5-minute OHLCV bars, so I wrote this code:
import pandas as pd
df= pd.read_csv("C:\Users\Araujo's PC\Desktop\python_scripts\CSV\cex_btc.csv",
names=['timestamps','open','high','low','close','volume'])
df.set_index('timestamps',inplace=True)
ohlc_dict = {
'open':'first',
'high':'max',
'low':'min',
'close':'last',
'volume':'sum'
}
df.resample('5T', how=ohlc_dict)
print df
But I get this error:
TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'
Can anyone help me solve this?
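You just need to convert the values in the timestamps column to pandas timestamps before setting the index from them. I believe they are currently just text fields.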
df['timestamps'] = pd.to_datetime(df['timestamps'])
df.set_index('timestamps', inplace=True)
>>> df.resample('5T', how=ohlc_dict)
                         high     close      open       low    volume
timestamps
2016-08-09 12:35:00  536.7849  536.7849  536.7841  536.6141  0.656000
2016-08-09 12:40:00  536.6749  534.8416  536.6749  534.1801  2.277200
2016-08-09 12:45:00  538.5999  537.7289  534.8131  534.2303  2.971872
2016-08-09 12:50:00  539.2199  539.2199  537.9829  537.9829  1.115219
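For readers on a newer pandas, here is a minimal sketch of the same fix. It assumes a release where `resample` no longer accepts `how=` and the aggregation is chained with `.agg(...)`, and it uses the `'5min'` alias in place of `'5T'`. The inline `raw` string (a stand-in for the CSV file) and the name `bars` are illustrative, not part of the original answer.

```python
import io

import pandas as pd

# Sample rows from the question, inlined so the sketch runs without the CSV file.
raw = """2016-08-09 12:39:00,536.7841,536.7849,536.6141,536.7849,0.656
2016-08-09 12:40:00,536.6749,536.6749,536.6749,536.6749,0.2642
2016-08-09 12:41:00,535.84,535.84,535.615,535.615,0.348
2016-08-09 12:42:00,535.5401,535.5401,534.1801,534.1801,0.507
2016-08-09 12:43:00,534.5891,534.8753,534.5891,534.807,0.656
2016-08-09 12:44:00,534.8014,534.878,534.8014,534.8416,0.502
2016-08-09 12:45:00,534.8131,534.8131,534.2303,534.6736,0.552
2016-08-09 12:47:00,534.756,538.5999,534.756,534.7836,0.62647241
2016-08-09 12:48:00,536.0557,536.6864,536.0557,536.6864,1.2614
2016-08-09 12:49:00,536.8966,537.7289,536.8966,537.7289,0.532
2016-08-09 12:50:00,537.9829,539.2199,537.9829,539.2199,0.67752932
2016-08-09 12:51:00,538.5,539.2199,538.5,539.2199,0.43768953
"""

cols = ['timestamps', 'open', 'high', 'low', 'close', 'volume']
df = pd.read_csv(io.StringIO(raw), names=cols)

# Convert the text column to real timestamps, then use it as the index.
df['timestamps'] = pd.to_datetime(df['timestamps'])
df = df.set_index('timestamps')

# One aggregation rule per column.
ohlc_dict = {
    'open': 'first',
    'high': 'max',
    'low': 'min',
    'close': 'last',
    'volume': 'sum',
}

# Group rows into 5-minute bins and aggregate each column by its rule.
bars = df.resample('5min').agg(ohlc_dict)
print(bars)
```

Note that `resample` returns a new object rather than modifying `df`, so the result has to be assigned (or printed directly); printing `df` afterwards still shows the 1-minute rows.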
You could also try parsing them when reading the csv:
pd.read_csv(filename, parse_dates=['timestamps'],
names=['timestamps','open','high','low','close','volume'])
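As a sketch of that read-time variant, the snippet below also passes `index_col` so the parsed column becomes the index in one step. The `index_col` argument and the inline `raw` string (standing in for the CSV file) are additions for illustration, not part of the original answer.

```python
import io

import pandas as pd

# A few rows of the question's data, inlined in place of the CSV file.
raw = """2016-08-09 12:39:00,536.7841,536.7849,536.6141,536.7849,0.656
2016-08-09 12:40:00,536.6749,536.6749,536.6749,536.6749,0.2642
2016-08-09 12:41:00,535.84,535.84,535.615,535.615,0.348
"""

df = pd.read_csv(
    io.StringIO(raw),
    names=['timestamps', 'open', 'high', 'low', 'close', 'volume'],
    parse_dates=['timestamps'],   # parse the column as datetimes while reading
    index_col='timestamps',       # and use it as the index right away
)

# The index is now a DatetimeIndex, which is what resample() requires.
print(type(df.index).__name__)
```

With the index built this way, no separate `pd.to_datetime` or `set_index` call is needed before resampling.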