Features dataframe with tsfresh

Question

I have a pandas dataframe that looks like this:

            time  000010  000017  000033      000034      000041     000042  \
0     672.246427     NaN     NaN     NaN  122.812927  367.110779  75.933125   
1     672.253247     NaN     NaN     NaN  126.228996  372.775421  78.117798   
2     672.260270     NaN     NaN     NaN  126.909046  369.460754  77.109196   
3     672.267205     NaN     NaN     NaN  129.729416  376.499878  76.996864   
4     672.274120     NaN     NaN     NaN  126.082420  380.343506  76.199158   
5     672.281085     NaN     NaN     NaN  127.412136  387.227203  78.589165   
6     672.288012     NaN     NaN     NaN  131.672180  394.507355  83.319740   
7     672.294974     NaN     NaN     NaN  128.294861  390.472992  78.814026   
8     672.301931     NaN     NaN     NaN  134.104858  393.601486  82.421974   
9     672.308877     NaN     NaN     NaN  119.213364  393.934875  80.444237   
10    672.315816     NaN     NaN     NaN  126.745148  378.437531  79.340736   
11    672.322750     NaN     NaN     NaN  114.940750  367.477142  76.719002   
12    672.329622     NaN     NaN     NaN  118.000877  364.089691  74.932938

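I intend to use it with the 'tsfresh' module to extract features. The numbered column headers are object IDs, and the time column holds the timestamps of the time series.

This dataframe is called 'data', so I am trying to call extract_features like this: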

extracted_features = extract_features(data, column_id = objs[1:], column_sort = "time")

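Here objs[1:] is the list of object ID column headers to the right of the "time" column.

This fails with an error related to 'The truth value of an array with more than one element is ambiguous'. Can anyone help me get this working and extract a nice pandas features dataframe?

Thanks a lot!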

Answer 1

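Maybe I misunderstood your question, but if I understood it correctly, you need to reshape your dataframe into a format that tsfresh can understand.

column_id expects (as its name suggests) the name of a single column that contains the IDs, and you do not have one. If I read this correctly, you have only 6 different IDs (000010, 000017, 000033, 000034, 000041, 000042), each with 13 values of a single kind of time series (let's call it data). So tsfresh wants a dataframe that looks like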

  id     kind  value       time
000034   data  122.812927  672.246427
...
000041   data  367.110779  672.246427   
...
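The reshaping described above can be sketched with pandas `melt`. This is only a minimal illustration, not necessarily how the answerer would do it: the small `wide` frame is a stand-in built from the first rows shown in the question, and the names `wide` and `long_df` are made up for this example.

```python
import pandas as pd

# Stand-in for the wide frame from the question (first three rows, two object columns).
wide = pd.DataFrame({
    "time": [672.246427, 672.253247, 672.260270],
    "000034": [122.812927, 126.228996, 126.909046],
    "000041": [367.110779, 372.775421, 369.460754],
})

# Turn every object column into rows: the old column header becomes the "id",
# the cell content becomes the "value", and "time" is repeated for each id.
long_df = wide.melt(id_vars="time", var_name="id", value_name="value")

# There is only one kind of time series here, so the kind column is constant.
long_df["kind"] = "data"

# Reorder the columns to match the layout shown above.
long_df = long_df[["id", "kind", "value", "time"]]
print(long_df)
```

Each (time, object ID) pair ends up as one row, so a frame with 13 time steps and 3 usable object columns would give 39 rows.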

You can then feed it into tsfresh using

extract_features(df, column_id="id", column_kind="kind", 
                 column_value="value", column_sort="time")

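In addition, you need to get rid of the NaN columns (because tsfresh does not know how to handle them).

Please have a look at our documentation on data formats: http://tsfresh.readthedocs.io/en/latest/text/data_formats.html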
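One possible way to drop the NaN columns is pandas `dropna` along the column axis. The sketch below assumes, as in the excerpt shown in the question, that the affected columns contain nothing but NaN. The tiny `data` frame and the name `cleaned` are illustrative only.

```python
import numpy as np
import pandas as pd

# Stand-in for the frame from the question: one all-NaN object column, one usable one.
data = pd.DataFrame({
    "time": [672.246427, 672.253247],
    "000010": [np.nan, np.nan],
    "000034": [122.812927, 126.228996],
})

# axis=1 works on columns; how="all" drops a column only if every value in it is NaN.
cleaned = data.dropna(axis=1, how="all")
print(cleaned.columns.tolist())
```

If some columns are only partly NaN, `how="all"` keeps them, and the remaining NaN values would still need to be removed or filled before the frame is passed to tsfresh.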


python

time-series

frame

ambiguous