Find a specific value multiple times in a time series

Question

I'm trying to find a specific value, 0 in this case, multiple times in a time series. The data looks like this:

structure(list(time = c(40, 41, 42, 43, 44, 44.9, 45.9, 46.9, 47.9, 48.9, 49.9, 50.8, 51.8, 52.8, 53.7, 54.6, 55.6, 56.5, 57.5, 58.5, 59.5, 60.5, 61.5, 62.5, 63.5, 64.5, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 71.5, 72.5, 73.5, 74.5, 75.5, 76.4, 77.3, 78.3, 79.3, 80.3, 81.2, 82.2, 83.2, 84.2, 85.2, 86.2, 87.2, 88.2, 89.2, 90.2, 91.2, 92.2, 93.2, 94.2, 95.2, 96.2, 97.2, 98.2, 99.2, 100.2, 101.2, 102, 103, 103.9, 104.9, 105.9, 106.8, 107.8, 108.8, 109.8, 110.8, 111.8, 112.8, 113.8, 114.4, 114.9, 115.8, 116.8, 117.8, 118.8, 119.8), value = c(33.6, 33.6, 33.6, 33.6, 33.6, 33.6, 34, 34, 34.4, 34.72, 29.12, 34.8, 19.04, 30.32, 1.36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.48, 28.64, 32, 32, 32, 32, 32, 32, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7.68, 31.12, 32, 32, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 32, 32, 32, 2.8, 0, 0, 0, 0, 0, 0, 0, 0, 22.16, 32, 31.92, 31.92, 38.8, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -83L), class = "data.frame")

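The problem is that I want the time at which each stretch of zeros in value begins. The procedure would be: when value drops to zero, give me the time at that point; then, when value rises again and later drops back to zero, give me that time as well, and so on. Plotting the data should help to understand the problem.

So the result for time should look like this: 54.6, 76.4, 102, 114.4

Edit: I don't know if it matters, but the original data is in a data.table.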

Answer 1

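There may be a more elegant way to do this. Here I take the cumulative sum of value == 0 and then compute the run lengths of that sum. It is probably easiest to understand by running the code piece by piece, from the inside out. In short, I use rle and cumsum to solve it.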

o <- structure(list(time = c(40, 41, 42, 43, 44, 44.9, 45.9, 46.9,  47.9, 48.9, 49.9, 50.8, 51.8, 52.8, 53.7, 54.6, 55.6, 56.5, 57.5,  58.5, 59.5, 60.5, 61.5, 62.5, 63.5, 64.5, 65.5, 66.5, 67.5, 68.5,  69.5, 70.5, 71.5, 72.5, 73.5, 74.5, 75.5, 76.4, 77.3, 78.3, 79.3,  80.3, 81.2, 82.2, 83.2, 84.2, 85.2, 86.2, 87.2, 88.2, 89.2, 90.2,  91.2, 92.2, 93.2, 94.2, 95.2, 96.2, 97.2, 98.2, 99.2, 100.2,  101.2, 102, 103, 103.9, 104.9, 105.9, 106.8, 107.8, 108.8, 109.8,  110.8, 111.8, 112.8, 113.8, 114.4, 114.9, 115.8, 116.8, 117.8,  118.8, 119.8), value = c(33.6, 33.6, 33.6, 33.6, 33.6,  33.6, 34, 34, 34.4, 34.72, 29.12, 34.8, 19.04, 30.32, 1.36, 0,  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.48, 28.64, 32, 32, 32,  32, 32, 32, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7.68, 31.12, 32, 32,  31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 32, 32,  32, 2.8, 0, 0, 0, 0, 0, 0, 0, 0, 22.16, 32, 31.92, 31.92, 38.8,  0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -83L), class = "data.frame")

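# Running count of zeros: it stays flat wherever value is non-zero
tmp <- rle(cumsum(o$value == 0))$lengths

# Runs longer than 1 end just before the next zero, so step one row forward
o[cumsum(tmp)[tmp > 1] + 1, "time"]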
[1]  54.6  76.4 102.0 114.4
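
To see why this works, it may help to run the same steps one at a time on a tiny example. The sketch below uses a made-up data frame called toy (hypothetical, not the asker's data) with two stretches of zeros starting at time 3 and time 6.

```r
# Hypothetical toy data: zeros begin at time 3 and at time 6
toy <- data.frame(time = 1:8, value = c(5, 3, 0, 0, 4, 0, 0, 0))

is_zero <- toy$value == 0        # FALSE FALSE TRUE TRUE FALSE TRUE TRUE TRUE
cs      <- cumsum(is_zero)       # 0 0 1 2 2 3 4 5  (flat where value != 0)
tmp     <- rle(cs)$lengths       # 2 1 2 1 1 1      (flat stretches have length > 1)

# cumsum(tmp) gives the last row of each run; a run longer than 1 ends
# on the row just before a new stretch of zeros, hence the + 1
idx    <- cumsum(tmp)[tmp > 1] + 1
starts <- toy$time[idx]
print(starts)
```

Note that this trick appears to depend on the shape of the data: if the series ends with non-zero values, the last long run points one row past the end of the data, and a stretch of zeros at the very start of the series (or after a single leading non-zero value) would not be picked up. For the data in the question neither case occurs.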

Answer 2

An option with tidyverse and rleid from data.table:

library(data.table)
library(dplyr)
o %>% 
     group_by(grp = rleid(value == 0)) %>% 
     filter(value == 0) %>% 
     group_by(grp) %>%
     slice(1) %>% 
     ungroup %>% 
     select(time)
# A tibble: 4 x 1
#   time
#  <dbl>
#1  54.6
#2  76.4
#3 102  
#4 114.
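
Since the question mentions that the original data is in a data.table, the same idea can probably be written without dplyr. This is only a sketch on a made-up table dt (hypothetical, not the asker's data); grp and res are illustrative names. The run id is stored as a column first, because a by expression is evaluated on the rows that remain after filtering.

```r
library(data.table)

# Hypothetical toy data: zeros begin at time 3 and at time 6
dt <- data.table(time = 1:8, value = c(5, 3, 0, 0, 4, 0, 0, 0))

# rleid() gives every run of consecutive identical TRUE/FALSE values its own id
dt[, grp := rleid(value == 0)]

# Keep only the zero rows and take the first time of each run
res <- dt[value == 0, .(time = time[1]), by = grp]$time
print(res)
```

On the asker's table the same two lines would be applied to the existing data.table instead of dt.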


r

time-series

detection