Finding the first trigger in a series and taking average from that point on

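I hope you are well. I would like some help analyzing data in which a series of trials is defined by a start trigger (ignoring the triggers that immediately follow it). In the example below, I want to find the first 1 in each run of 1s and then take the mean of the next three numbers in Value_1 and Value_2. The code should then find the next start (the 8th value, which begins the next group of 1s) and again take the mean of the following 3 values, and so on. Thank you for your help; I am happy to answer any questions.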

df <- data.frame(Value_1 = c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10), Value_2 = c(10,2,3,4,5,6,7,8,10,10,1,2,3,4,5,6,7,8,9,10), Triggers = c(0,1,1,1,0,0,0,1,1,1,0,0,0,0,0,1,1,1,0,0))

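In the updated_df example below, I would like the code to cope with possible breaks in the trigger values (for example, a 0 inside a run of 1s). It should find the first 1 in a group of 1s and possible 0s, then take the mean of the next four numbers in Value_1 and Value_2. It should then find the next start (the 9th value, which begins the next group of 1s and 0s) and again take the mean of the following 4 values, and so on.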

updated_df <- data.frame(
  Value_1 = c(1,2,3,3,4,5,6,7,8,9,9,10,1,2,3,4,5,6,7,8,9,9,10),
  Value_2 = c(10,2,3,3,4,5,6,7,8,10,10,10,1,2,3,4,5,6,6,7,8,9,10),
  Triggers = c(0,1,1,0,1,0,0,0,1,1,0,1,0,0,0,0,0,1,0,1,1,0,0)
)

Here is a base R solution that handles the updated problem ("breaks in the trigger values"). It includes a lag function based on this SO answer.

updated_df <- data.frame(
  Value_1 = c(1,2,3,3,4,5,6,7,8,9,9,10,1,2,3,4,5,6,7,8,9,9,10),
  Value_2 = c(10,2,3,3,4,5,6,7,8,10,10,10,1,2,3,4,5,6,6,7,8,9,10),
  Triggers = c(0,1,1,0,1,0,0,0,1,1,0,1,0,0,0,0,0,1,0,1,1,0,0)
)

# lag function, based on @Andrew's answer (the SO answer referenced above)
lag_fx <- function(x, by = 1L, default = NA) {
  if (by < 0 || !isTRUE(all.equal(by, round(by)))) {
    stop("`by` should be a whole number >= 0")
  }
  c(rep(default, by), x)[seq_along(x)]
}

# number of trials per set
set_k <- 4

### to find index of each start trigger:
# (1) make matrix to "look back" at previous k - 1 trials
lagged <- sapply(
  1:(set_k - 1), 
  \(x) lag_fx(updated_df$Triggers, by = x, default = 0)
)

# (2) then find rows where trigger == 1, but no 1s in previous k - 1 trials
starts <- which(updated_df$Triggers == 1 & rowSums(lagged) == 0)

# indices of each trigger and following k - 1 rows
sets <- lapply(starts, \(x) x + 0:(set_k - 1))

# means of each set of trials
Value_1 <- sapply(sets, \(x) mean(updated_df$Value_1[x]))
Value_2 <- sapply(sets, \(x) mean(updated_df$Value_2[x]))

# back to a data.frame
data.frame(Value_1, Value_2)

#   Value_1 Value_2
# 1     3.0    3.00
# 2     9.0    9.50
# 3     7.5    6.75
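
The same idea can be wrapped in a function so that it also covers the first example (`df`, sets of three). What follows is only a sketch under some assumptions: `trigger_set_means` and its argument names are hypothetical, a "set" is taken to be the start-trigger row plus the following `k - 1` rows (as in the code above), and sets that would run past the last row are dropped. It counts the 1s in the previous `k - 1` rows with a cumulative sum instead of the lagged matrix, which is a swapped-in technique that should select the same start rows.

```r
# Hypothetical helper, not part of the original answer: mean of each k-row
# set that begins at a start trigger.
trigger_set_means <- function(data, k, trigger_col = "Triggers") {
  trig <- data[[trigger_col]]
  n <- length(trig)

  # cs[i] holds the number of 1s in rows 1..(i - 1)
  cs <- c(0, cumsum(trig == 1))
  idx <- seq_len(n)
  lo <- pmax(idx - (k - 1), 1)      # first row of the look-back window
  prev_ones <- cs[idx] - cs[lo]     # number of 1s in rows lo..(i - 1)

  # a start is a 1 with no 1s in the previous k - 1 rows
  starts <- which(trig == 1 & prev_ones == 0)

  # assumption: drop sets that would extend past the end of the data
  starts <- starts[starts + k - 1 <= n]

  # average every non-trigger column over each set of k rows
  value_cols <- setdiff(names(data), trigger_col)
  out <- lapply(value_cols, function(col) {
    vapply(starts, function(s) mean(data[[col]][s:(s + k - 1)]), numeric(1))
  })
  names(out) <- value_cols

  data.frame(start = starts, out)
}

# first example from the question: sets of three
df <- data.frame(
  Value_1 = c(1,2,3,4,5,6,7,8,9,10,1,2,3,4,5,6,7,8,9,10),
  Value_2 = c(10,2,3,4,5,6,7,8,10,10,1,2,3,4,5,6,7,8,9,10),
  Triggers = c(0,1,1,1,0,0,0,1,1,1,0,0,0,0,0,1,1,1,0,0)
)

res <- trigger_set_means(df, k = 3)
print(res)
```

For `df` the start rows come out as 2, 8 and 16, which matches the "8th value" mentioned in the question. Calling it on `updated_df` with `k = 4` should reproduce the three rows of means shown above. If "the next three numbers" is meant to exclude the trigger row itself, the window `s:(s + k - 1)` would need to be shifted by one.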