Python freezes with many repeated pandas calls
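I am trying to compute the variance of a time series at many sampling frequencies (the so-called signature plot). I use the resample method in a loop over a set of frequencies, but Python stops the task before it finishes (no error, it just freezes).
Here is the code: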
var_list = [timeseries.resample(rule=str(int(freq))+'min',how='first').var() for freq in np.linspace(2,20,10)]
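Note that with one fewer iteration everything works fine. Memory and CPU usage are also very low, so I don't understand why it fails with one more iteration.
[Edit]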
http://www.filedropper.com/14081
import datetime
import numpy as np
import pandas as pd
data = pd.io.parsers.read_csv(filepath_or_buffer="/media/snake91/01D05438A403F9501/Econometrics/"+datatype+"/"+csvinfile+".csv", sep=',',
decimal='.',usecols=["Date","Time","Close"], keep_date_col=True)
data['DateTime'] = data.apply(lambda row: datetime.datetime.strptime(row['Date']+ ' ' + row['Time'], '%d/%m/%Y %H:%M'), axis=1)
data.set_index('DateTime', inplace=True)
price = data["Close"]
I only use the 'Close' column.
[Edit 2]
After repeated attempts I got this:
*** Error in `/usr/bin/python': double free or corruption (out): 0x00000000030ba810 ***
Is this a bug?
Your approach looks a bit complicated... I hope my simplification is what you need...
# get an index of pandas Timestamps
df.index = pd.to_datetime(df.Date + ' ' + df.Time)
# get the column we want as a pandas Series called price
price = df['Close']
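One caveat, offered as a hedged aside: the question parses dates with '%d/%m/%Y %H:%M' (day first), whereas pd.to_datetime without a format has to guess and may read an ambiguous date such as 03/04/2014 as month first. A minimal sketch that passes the format explicitly; the sample rows are made up for illustration and are not taken from the linked file:

```python
import pandas as pd

# Hypothetical sample rows in the same Date/Time/Close layout as the question's CSV.
df = pd.DataFrame({
    "Date": ["03/04/2014", "03/04/2014", "04/04/2014"],
    "Time": ["09:00", "09:01", "09:00"],
    "Close": [100.0, 100.5, 101.0],
})

# Vectorised parse using the day-first format from the question,
# so 03/04/2014 is read as 3 April rather than 4 March.
df.index = pd.to_datetime(df["Date"] + " " + df["Time"], format="%d/%m/%Y %H:%M")

# The column of interest as a Series indexed by Timestamps.
price = df["Close"]
```

This keeps the one-line index construction while staying consistent with the strptime format the question used.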
Update
# use a list comprehension to construct a list of variances,
# for the various resampling periods
var_list = [price.resample(str(int(i))+'min', how='first').var()
for i in np.linspace(2,20,10)]
Which yielded...
In [10]: var_list
Out[10]:
[0.077889810612269461,
0.077385129726302446,
0.079956521234607447,
0.077604408646643086,
0.077813415563354235,
0.080675086585717218,
0.074652971598985707,
0.0763870569776786,
0.076195162549351256,
0.076852363707017035]
In dictionary form...
In [11]: %paste
# use a comprehension to construct a dictionary, each entry of which
# has the variance for each resampling period
var_dic = {i: price.resample(str(int(i))+'min', how='first').var()
for i in np.linspace(2,20,10)}
## -- End pasted text --
In [12]: var_dic
Out[12]:
{2.0: 0.077889810612269461,
4.0: 0.077385129726302446,
6.0: 0.079956521234607447,
8.0: 0.077604408646643086,
10.0: 0.077813415563354235,
12.0: 0.080675086585717218,
14.0: 0.074652971598985707,
16.0: 0.0763870569776786,
18.0: 0.076195162549351256,
20.0: 0.076852363707017035}
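For readers on a recent pandas, note that the how= argument to resample was deprecated and later removed; the aggregation is now a method call on the resampler. A rough sketch of the same dictionary under that assumption, using a synthetic random-walk series in place of the real 'Close' data (so the numbers will not match the output above):

```python
import numpy as np
import pandas as pd

# Synthetic 1-minute price series (a random walk) standing in for the real 'Close' column.
rng = np.random.default_rng(0)
index = pd.date_range("2014-01-01 09:00", periods=2000, freq="min")
price = pd.Series(100 + rng.normal(0, 0.05, size=len(index)).cumsum(), index=index)

# Integer minute periods 2, 4, ..., 20, the same values np.linspace(2, 20, 10) produces.
periods = range(2, 21, 2)

# For each period: keep the first observation in every bin,
# then take the variance of the resampled series.
var_dic = {n: price.resample(f"{n}min").first().var() for n in periods}
```

Using integer periods directly avoids the str(int(i)) round trip and gives integer dictionary keys instead of floats.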