在时间序列中多次查找特定值
Find a specific value multiple times in a time series
我试图在时间序列中多次查找特定值,在本例中为 0。数据看起来像这样
structure(list(time = c(40, 41, 42, 43, 44, 44.9, 45.9, 46.9, 47.9, 48.9, 49.9, 50.8, 51.8, 52.8, 53.7, 54.6, 55.6, 56.5, 57.5, 58.5, 59.5, 60.5, 61.5, 62.5, 63.5, 64.5, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 71.5, 72.5, 73.5, 74.5, 75.5, 76.4, 77.3, 78.3, 79.3, 80.3, 81.2, 82.2, 83.2, 84.2, 85.2, 86.2, 87.2, 88.2, 89.2, 90.2, 91.2, 92.2, 93.2, 94.2, 95.2, 96.2, 97.2, 98.2, 99.2, 100.2, 101.2, 102, 103, 103.9, 104.9, 105.9, 106.8, 107.8, 108.8, 109.8, 110.8, 111.8, 112.8, 113.8, 114.4, 114.9, 115.8, 116.8, 117.8, 118.8, 119.8), value = c(33.6, 33.6, 33.6, 33.6, 33.6, 33.6, 34, 34, 34.4, 34.72, 29.12, 34.8, 19.04, 30.32, 1.36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.48, 28.64, 32, 32, 32, 32, 32, 32, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7.68, 31.12, 32, 32, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 32, 32, 32, 2.8, 0, 0, 0, 0, 0, 0, 0, 0, 22.16, 32, 31.92, 31.92, 38.8, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -83L), class = "data.frame")
问题是我想找到 value
中第一个零出现一段时间的时间。所以这个过程就像:如果 value
下降到零,请给我 time
中这一点的时间,然后如果 value
上升到 time
然后再次下降归零给我时间 time
等等...
绘制数据应该有助于理解问题。
所以 time
的结果应该是这样的:54.6, 76.4, 102, 114.4
编辑:我不知道它是否重要,但原始数据在 data.table.
可能有更优雅的方法来做到这一点。在这里,我将值的累加和设为零,然后计算“运行 长度”。也许最容易理解 运行 逐段运行代码 inside-out。一言以蔽之,我用rle
和cumsum
来解决
o <- structure(list(time = c(40, 41, 42, 43, 44, 44.9, 45.9, 46.9, 47.9, 48.9, 49.9, 50.8, 51.8, 52.8, 53.7, 54.6, 55.6, 56.5, 57.5, 58.5, 59.5, 60.5, 61.5, 62.5, 63.5, 64.5, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 71.5, 72.5, 73.5, 74.5, 75.5, 76.4, 77.3, 78.3, 79.3, 80.3, 81.2, 82.2, 83.2, 84.2, 85.2, 86.2, 87.2, 88.2, 89.2, 90.2, 91.2, 92.2, 93.2, 94.2, 95.2, 96.2, 97.2, 98.2, 99.2, 100.2, 101.2, 102, 103, 103.9, 104.9, 105.9, 106.8, 107.8, 108.8, 109.8, 110.8, 111.8, 112.8, 113.8, 114.4, 114.9, 115.8, 116.8, 117.8, 118.8, 119.8), value = c(33.6, 33.6, 33.6, 33.6, 33.6, 33.6, 34, 34, 34.4, 34.72, 29.12, 34.8, 19.04, 30.32, 1.36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.48, 28.64, 32, 32, 32, 32, 32, 32, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7.68, 31.12, 32, 32, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 32, 32, 32, 2.8, 0, 0, 0, 0, 0, 0, 0, 0, 22.16, 32, 31.92, 31.92, 38.8, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -83L), class = "data.frame")
tmp <- rle(cumsum(o$value == 0))$lengths
o[cumsum(tmp)[tmp > 1] + 1,"time"]
[1] 54.6 76.4 102.0 114.4
tidyverse
和 rleid
来自 data.table
的选项
library(data.table)
library(dplyr)
o %>%
group_by(grp = rleid(value == 0)) %>%
filter(value == 0) %>%
group_by(grp) %>%
slice(1) %>%
ungroup %>%
select(time)
# A tibble: 4 x 1
# time
# <dbl>
#1 54.6
#2 76.4
#3 102
#4 114.
我试图在时间序列中多次查找特定值,在本例中为 0。数据看起来像这样
structure(list(time = c(40, 41, 42, 43, 44, 44.9, 45.9, 46.9, 47.9, 48.9, 49.9, 50.8, 51.8, 52.8, 53.7, 54.6, 55.6, 56.5, 57.5, 58.5, 59.5, 60.5, 61.5, 62.5, 63.5, 64.5, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 71.5, 72.5, 73.5, 74.5, 75.5, 76.4, 77.3, 78.3, 79.3, 80.3, 81.2, 82.2, 83.2, 84.2, 85.2, 86.2, 87.2, 88.2, 89.2, 90.2, 91.2, 92.2, 93.2, 94.2, 95.2, 96.2, 97.2, 98.2, 99.2, 100.2, 101.2, 102, 103, 103.9, 104.9, 105.9, 106.8, 107.8, 108.8, 109.8, 110.8, 111.8, 112.8, 113.8, 114.4, 114.9, 115.8, 116.8, 117.8, 118.8, 119.8), value = c(33.6, 33.6, 33.6, 33.6, 33.6, 33.6, 34, 34, 34.4, 34.72, 29.12, 34.8, 19.04, 30.32, 1.36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.48, 28.64, 32, 32, 32, 32, 32, 32, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7.68, 31.12, 32, 32, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 32, 32, 32, 2.8, 0, 0, 0, 0, 0, 0, 0, 0, 22.16, 32, 31.92, 31.92, 38.8, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -83L), class = "data.frame")
问题是我想找到 value
中第一个零出现一段时间的时间。所以这个过程就像:如果 value
下降到零,请给我 time
中这一点的时间,然后如果 value
上升到 time
然后再次下降归零给我时间 time
等等...
绘制数据应该有助于理解问题。
所以 time
的结果应该是这样的:54.6, 76.4, 102, 114.4
编辑:我不知道它是否重要,但原始数据在 data.table.
可能有更优雅的方法来做到这一点。在这里,我将值的累加和设为零,然后计算“运行 长度”。也许最容易理解 运行 逐段运行代码 inside-out。一言以蔽之,我用rle
和cumsum
来解决
o <- structure(list(time = c(40, 41, 42, 43, 44, 44.9, 45.9, 46.9, 47.9, 48.9, 49.9, 50.8, 51.8, 52.8, 53.7, 54.6, 55.6, 56.5, 57.5, 58.5, 59.5, 60.5, 61.5, 62.5, 63.5, 64.5, 65.5, 66.5, 67.5, 68.5, 69.5, 70.5, 71.5, 72.5, 73.5, 74.5, 75.5, 76.4, 77.3, 78.3, 79.3, 80.3, 81.2, 82.2, 83.2, 84.2, 85.2, 86.2, 87.2, 88.2, 89.2, 90.2, 91.2, 92.2, 93.2, 94.2, 95.2, 96.2, 97.2, 98.2, 99.2, 100.2, 101.2, 102, 103, 103.9, 104.9, 105.9, 106.8, 107.8, 108.8, 109.8, 110.8, 111.8, 112.8, 113.8, 114.4, 114.9, 115.8, 116.8, 117.8, 118.8, 119.8), value = c(33.6, 33.6, 33.6, 33.6, 33.6, 33.6, 34, 34, 34.4, 34.72, 29.12, 34.8, 19.04, 30.32, 1.36, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.48, 28.64, 32, 32, 32, 32, 32, 32, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 7.68, 31.12, 32, 32, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 31.6, 32, 32, 32, 2.8, 0, 0, 0, 0, 0, 0, 0, 0, 22.16, 32, 31.92, 31.92, 38.8, 0, 0, 0, 0, 0, 0, 0)), row.names = c(NA, -83L), class = "data.frame")
tmp <- rle(cumsum(o$value == 0))$lengths
o[cumsum(tmp)[tmp > 1] + 1,"time"]
[1] 54.6 76.4 102.0 114.4
tidyverse
和 rleid
来自 data.table
library(data.table)
library(dplyr)
o %>%
group_by(grp = rleid(value == 0)) %>%
filter(value == 0) %>%
group_by(grp) %>%
slice(1) %>%
ungroup %>%
select(time)
# A tibble: 4 x 1
# time
# <dbl>
#1 54.6
#2 76.4
#3 102
#4 114.