Compare a pandas moving window to a list to find the window with least error
I have reduced my dataset down to the last few steps. My pandas dataframe looks like this:
FAC
0 1
1 2
2 1
3 3
4 2
5 1
6 2
7 1
8 1
9 3
10 2
11 1
12 2
13 3
14 1
I also have a list that I am trying to match:
match_list = [1, 2, 1, 1, 3]
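What I want is to slide a 5-item window over the dataframe column and find the rows where the window matches the list pattern. The final result would look like this. Any help would be appreciated.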
FAC Error
0 1 some val
1 2 some val
2 1 some val
3 3 some val
4 2 some val
5 1 some val
6 2 some val
7 1 0
8 1 some val
9 3 some val
10 2 some val
11 1 some val
12 2 some val
13 3 some val
14 1 some val
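This can be done with rolling:
import numpy as np

match_list = np.array([1, 2, 1, 1, 3])

def match(x):
    # True only when the full 5-row window equals match_list element by element
    return len(x) == len(match_list) and (x == match_list).all()

# center=True labels each window by its middle row; apply returns 1.0 on a match.
# np.where mixes 0 with a string, so the resulting column holds strings.
df['error'] = np.where(df.FAC.rolling(5, center=True).apply(match) == 1, 0, 'some value')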
Output:
FAC error
0 1 some value
1 2 some value
2 1 some value
3 3 some value
4 2 some value
5 1 some value
6 2 some value
7 1 0
8 1 some value
9 3 some value
10 2 some value
11 1 some value
12 2 some value
13 3 some value
14 1 some value
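As a possible alternative to rolling.apply, the exact-match check above can be sketched in vectorized form. This is only a sketch under stated assumptions: it rebuilds the example dataframe from the question, it relies on numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20 and later), and the names windows, exact and centers are introduced here for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Example data copied from the question.
df = pd.DataFrame({"FAC": [1, 2, 1, 3, 2, 1, 2, 1, 1, 3, 2, 1, 2, 3, 1]})
match_list = np.array([1, 2, 1, 1, 3])
n = len(match_list)

# Each row of `windows` is one length-n slice of FAC (a view, no copy).
windows = sliding_window_view(df["FAC"].to_numpy(), n)

# True for every window that equals match_list in all positions.
exact = (windows == match_list).all(axis=1)

# Window i covers rows i..i+n-1; label it by its centre row, i + n // 2,
# to mirror rolling(n, center=True).
centers = np.flatnonzero(exact) + n // 2
print(centers)
```

On this data the only fully matching window is the one centred on row 7, which agrees with the rolling output above.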
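Update: to score how many positions match, use mean instead of all inside the function. This returns the fraction of positions in each window that agree with match_list:
def count_match(x):
    # fraction of positions in the window that equal match_list
    return (len(x) == len(match_list)) * (x == match_list).mean()

df['error'] = df.FAC.rolling(5, center=True).apply(count_match)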
Output:
FAC error
0 1 NaN
1 2 NaN
2 1 0.6
3 3 0.0
4 2 0.4
5 1 0.4
6 2 0.2
7 1 1.0
8 1 0.2
9 3 0.2
10 2 0.4
11 1 0.6
12 2 0.0
13 3 NaN
14 1 NaN
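Since the question title asks for the window with the least error, here is a hedged sketch that turns the match fraction around into a mismatch count and then picks the lowest-error row. It assumes "error" means the number of positions that differ from match_list, which the original post does not define; the helper mismatch_count and the variable best_row are hypothetical names used only for this example.

```python
import numpy as np
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({"FAC": [1, 2, 1, 3, 2, 1, 2, 1, 1, 3, 2, 1, 2, 3, 1]})
match_list = np.array([1, 2, 1, 1, 3])

def mismatch_count(x):
    # Assumed error definition: number of positions that differ from match_list.
    # x is a NumPy array of full window length because raw=True is passed below.
    return float((x != match_list).sum())

df["error"] = (
    df["FAC"]
    .rolling(len(match_list), center=True)
    .apply(mismatch_count, raw=True)
)

# idxmin skips the NaN rows at the edges, where no full window exists.
best_row = df["error"].idxmin()
print(best_row, df.loc[best_row, "error"])
```

With the example data this selects row 7 with an error of 0.0, the same row the exact-match version flags. If several windows tie for the lowest error, idxmin returns the first one.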