How to use the tsfresh Python package to extract features from time series data?
I have a list of lists, where each inner list represents one time series:
tsli=[[43,65,23,765,233,455,7,32,57,78,4,32],[34,32,565,87,23,86,32,56,32,57,78,32],[87,43,12,46,32,46,13,23,6,90,67,8],[1,2,3,3,4,5,6,7,8,9,0,9],[12,34,56,76,34,12,45,67,34,21,12,22]]
I want to extract features from this dataset with the tsfresh package, using this code:
import tsfresh
tf=tsfresh.extract_features(tsli)
When I run it, I get a ValueError:
> ValueError: You have to set the column_id which contains the ids of the different time series
But I don't know how to deal with this or how to define the column id for this problem.
Edit 1:
As suggested, I tried converting the dataset to a DataFrame and then ran:
import pandas as pd
import tsfresh
df=pd.DataFrame(tsli)
tf=tsfresh.extract_features(df)
But I get the same ValueError:
> ValueError: You have to set the column_id which contains the ids of the different time series
Any resources or references would be helpful.
Thanks
First, you have to convert the list into a DataFrame in which every time series has a unique id, e.g.:
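import pandas as pd

# Long format: one row per observation, tagged with the id of its series.
# (DataFrame.append was removed in pandas 2.0, so the rows are built in one pass.)
df = pd.DataFrame(
    [(x, i) for i, ts in enumerate(tsli) for x in ts],
    columns=['value', 'id'],
)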
...
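As a variation on the conversion above, here is a minimal sketch that also records each observation's position within its series. The helper name `to_long_format` and the column names `id`, `time`, and `value` are my own choices for illustration and do not come from tsfresh.

```python
import pandas as pd


def to_long_format(series_list):
    """Hypothetical helper: flatten a list of series into a long DataFrame.

    Each row holds the series id, the position of the observation inside
    that series, and the observed value. Series may differ in length.
    """
    rows = [
        {"id": series_id, "time": t, "value": value}
        for series_id, ts in enumerate(series_list)
        for t, value in enumerate(ts)
    ]
    return pd.DataFrame(rows, columns=["id", "time", "value"])


# Small demo input (the first three values of the first two series).
tsli_small = [[43, 65, 23], [34, 32, 565]]
long_df = to_long_format(tsli_small)
print(long_df)
```

With an explicit `time` column, the ordering within each series can be passed to tsfresh through the `column_sort` argument of `extract_features`, so it no longer depends on row order.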
Now you can call tsfresh with the column_id parameter pointing at the created column:
tf=tsfresh.extract_features(df, column_id='id')
>> Feature Extraction: 100%|██████████| 5/5 [00:00<00:00, 36.83it/s]
For another example, see the tsfresh Quick Start.