Filtering static/stationary areas

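I am trying to filter my sensor data. My objective is to keep the sensor data where the readings are more or less stationary over a period of time. Can anyone help me with this?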

time  sensor
   1     121
   2     115
   3     122
   4     123
   5     116
   6     117
   7     113
   8     116
   9     113
  10     114
  11     115
  12     112
  13     116
  14     129
  15     123
  16     125
  17     130
  18     120
  19     121
  20     122

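This is sample data. I need to take the first reading and compare it with the next 20 seconds of data. If all 20 readings are within + or - 10 of it, I need to copy those 20 readings into another column, and then continue this filtering process.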

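No sample data was provided, so I generated some. Filtering by time could obviously be between two datetimes; I have just picked certain hours. As an example of stable, I select values between the 45th and 55th percentile.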

import datetime as dt
import numpy as np
import pandas as pd

# one observation per minute over a single day, with random values
t = pd.date_range(dt.date(2021,1,10), dt.date(2021,1,11), freq="min")
df = pd.DataFrame({"time":t, "val":np.random.dirichlet(np.ones(len(t)),size=1)[0]})
# filter on hour and val.  val between 45th and 55th percentile
df2 = df[df.time.dt.hour.between(3,4) & df.val.between(df.val.quantile(.45), df.val.quantile(.55))]

Output

               time      val
2021-01-10 03:13:00 0.000499
2021-01-10 03:41:00 0.000512
2021-01-10 04:00:00 0.000541
2021-01-10 04:39:00 0.000413
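As a rough sketch of the "between two datetimes" variant mentioned above: the same filter can take explicit start and end timestamps instead of hours. The names `start`, `end`, `lo` and `hi`, the chosen timestamps, and the seeded uniform random values are assumptions made for illustration only.

```python
import datetime as dt
import numpy as np
import pandas as pd

# Generated data: one value per minute for one day (seeded so it is repeatable)
t = pd.date_range(dt.date(2021, 1, 10), dt.date(2021, 1, 11), freq="min")
gen = np.random.default_rng(0)
df = pd.DataFrame({"time": t, "val": gen.random(len(t))})

# Hypothetical time window; replace with the datetimes you care about
start = pd.Timestamp("2021-01-10 03:00")
end = pd.Timestamp("2021-01-10 04:30")

# Band treated as "stable" here: 45th to 55th percentile of all values
lo, hi = df["val"].quantile([0.45, 0.55])

# Keep rows inside both the time window and the value band (both inclusive)
df2 = df[df["time"].between(start, end) & df["val"].between(lo, hi)]
print(df2)
```

Both `between` calls are inclusive at each end by default, so rows exactly at `start` or `end` are kept.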

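Rolling window

The question has been updated to state that stable is defined as the next window rows being within +/- rng, with the output in a new column.

With this definition, use rolling() and a lambda function to check that all subsequent rows in the window are within the tolerance of the first observation in the window. Any observation for which this does not hold returns NaN. Also note that the last window - 1 rows return NaN because there are not enough remaining rows to test against.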

import pandas as pd
import io
import datetime as dt
import numpy as np
df = pd.read_csv(io.StringIO("""sensor
121
115
122
123
116
117
113
116
113
114
115
112
116
129
123
125
130
120
121
122"""))
df["time"] = pd.date_range(dt.date(2021,1,10), freq="s", periods=len(df))

# how many rows to compare
window = 5
# +/- range
rng = 10
# raw=True passes each window as a NumPy array, so x[0] is the first observation in the window.
# Rolling over the "sensor" column only avoids errors from the non-numeric "time" column.
# shift(-(window-1)) moves each result back to the row where its window starts.
df["stable"] = (
    df["sensor"]
    .rolling(window)
    .apply(lambda x: x[0] if (np.abs(x - x[0]) <= rng).all() else np.nan, raw=True)
    .shift(-(window-1))
)

Output

 sensor                time  stable
    121 2021-01-10 00:00:00   121.0
    115 2021-01-10 00:00:01   115.0
    122 2021-01-10 00:00:02   122.0
    123 2021-01-10 00:00:03   123.0
    116 2021-01-10 00:00:04   116.0
    117 2021-01-10 00:00:05   117.0
    113 2021-01-10 00:00:06   113.0
    116 2021-01-10 00:00:07   116.0
    113 2021-01-10 00:00:08   113.0
    114 2021-01-10 00:00:09     NaN
    115 2021-01-10 00:00:10     NaN
    112 2021-01-10 00:00:11     NaN
    116 2021-01-10 00:00:12     NaN
    129 2021-01-10 00:00:13   129.0
    123 2021-01-10 00:00:14   123.0
    125 2021-01-10 00:00:15   125.0
    130 2021-01-10 00:00:16     NaN
    120 2021-01-10 00:00:17     NaN
    121 2021-01-10 00:00:18     NaN
    122 2021-01-10 00:00:19     NaN
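The same check can be sketched without a lambda. All rows in a window lie within +/- rng of the first row exactly when the window maximum is at most rng above it and the window minimum is at most rng below it. This is an alternative formulation, not the code used above; the names `fwd_max`, `fwd_min` and `ok` are made up for illustration.

```python
import pandas as pd

sensor = pd.Series([121, 115, 122, 123, 116, 117, 113, 116, 113, 114,
                    115, 112, 116, 129, 123, 125, 130, 120, 121, 122])

window = 5   # how many rows to compare (the row itself plus the next 4)
rng = 10     # +/- tolerance around the first row of each window

# rolling() looks backwards, so shift the results up by window - 1 rows
# to line each max/min up with the row where its window starts
fwd_max = sensor.rolling(window).max().shift(-(window - 1))
fwd_min = sensor.rolling(window).min().shift(-(window - 1))

# Comparisons against NaN are False, so the incomplete trailing windows fail
ok = (fwd_max - sensor <= rng) & (sensor - fwd_min <= rng)

# Keep the first observation where the window is stable, NaN elsewhere
stable = sensor.where(ok)
print(pd.DataFrame({"sensor": sensor, "stable": stable}))
```

Because it relies only on the built-in rolling max and min, this form avoids calling a Python function per window, which may matter on long series.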

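Your question is not very clear, but as I understand it, you want to take a duration of 20 seconds and, if a sensor reading is within +10 and -10 of the first reading, append it to a new column, while readings above or below that range should not be considered. I have tried replicating your DataFrame, and you can proceed like this: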

import pandas as pd
data = {'time':[1, 2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23], 
        'sensor':[121, 115, 122, 123,116,117,113,116,113,114,115,112,116,129,123,125,130,120,121,122,123,124,144]}

df_new = pd.DataFrame(data) #I am taking time duration of 23 seconds where 23rd second data is out of range as 144 - 121 > 10

    time  sensor
0      1     121
1      2     115
2      3     122
3      4     123
4      5     116
5      6     117
6      7     113
7      8     116
8      9     113
9     10     114
10    11     115
11    12     112
12    13     116
13    14     129
14    15     123
15    16     125
16    17     130
17    18     120
18    19     121
19    20     122
20    21     123
21    22     124
22    23     144

result = []  # avoid naming this "list", which would shadow the built-in
for i in range(len(df_new)):
    # only compare rows inside the time window that starts at the first row
    if 0 <= df_new['time'][i] - df_new['time'][0] <= 23: # use 20 for your requirement; 23 is used here to demonstrate the value 144
        if -10 < df_new['sensor'][0] - df_new['sensor'][i] < 10:
            result.append(df_new['sensor'][i])
        else:
            result.append('out of range')
    else:
        # outside the window: pad so that result stays the same length as df_new
        result.append(None)

df_new['result'] = result

df_new

    time  sensor          result
0      1     121           121
1      2     115           115
2      3     122           122
3      4     123           123
4      5     116           116
5      6     117           117
6      7     113           113
7      8     116           116
8      9     113           113
9     10     114           114
10    11     115           115
11    12     112           112
12    13     116           116
13    14     129           129
14    15     123           123
15    16     125           125
16    17     130           130
17    18     120           120
18    19     121           121
19    20     122           122
20    21     123           123
21    22     124           124
22    23     144  out of range
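The loop above handles only the first time window. One possible reading of "continue this filtering process" is to restart the comparison at every block of 20 seconds, using the first reading of each block as the reference. The following is a sketch under that assumption; `duration`, `tolerance`, `block` and `ref` are illustrative names, and here a block covers time offsets 0 to 19 rather than the inclusive 0 to 20 used in the loop.

```python
import pandas as pd

df_new = pd.DataFrame({
    "time": list(range(1, 24)),
    "sensor": [121, 115, 122, 123, 116, 117, 113, 116, 113, 114, 115, 112,
               116, 129, 123, 125, 130, 120, 121, 122, 123, 124, 144],
})

duration = 20    # seconds per block (assumption: blocks do not overlap)
tolerance = 10   # a reading must differ from the reference by less than this

# Block number for every row: 0 for the first 20 seconds, 1 for the next, ...
block = (df_new["time"] - df_new["time"].iloc[0]) // duration

# Reference value for every row: the first sensor reading of its block
ref = df_new.groupby(block)["sensor"].transform("first")

# Keep readings close to their block's reference, NaN for the rest
in_range = (df_new["sensor"] - ref).abs() < tolerance
df_new["result"] = df_new["sensor"].where(in_range)
print(df_new)
```

With this data the first block (reference 121) keeps all 20 readings, and in the second block (reference 123) only the reading 144 is dropped.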