R: get the mean of responses only if the previous response is of a specific type

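I have been stuck for a while on getting the mean of responses under a condition, and I would appreciate help from anyone who has a clearer head than I do right now.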

    Trial <- c("1", "1", "2", "2", "3", "3", "4", "4","5", "5", "6", "6", "7", "7", "8", "8", "9", "9", "10", "10") 
    Session <- c("2", "6", "2", "6", "2", "6", "2", "6", "2", "6", "2", "6", "2", "6", "2", "6", "2", "6", "2", "6") 
    Type <- c("x", "x", "x", "x", "y", "y", "x", "x", "y", "y", "y", "y", "x", "x", "y", "y", "y", "y", "x", "x") 
    Response <- c("3", "2", "2", "4", "2", "4", "6", "1", "3", "4", "2", "5", "1", "6", "5", "4", "6", "1", "3", "4") 
    df <- data.frame(Trial, Session, Type, Response)

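I have responses from several sessions. How can I get the mean of "Response" for Type x in Session 2, but only if the previous "Response" was in Session 6 and of Type y?

The expected output is just the mean response (a single number).

Thank you for your time. Please let me know if any other information is needed.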

You can use dplyr::lag to get lagged vectors for the conditional statement:

    mean(df$Response[which(df$Session == 2 & 
                           df$Type == "x" & 
                           dplyr::lag(df$Session) == 6 &
                           dplyr::lag(df$Type) == "y")])
    #> [1] 3.333333

Created on 2022-04-03 by the reprex package (v2.0.1)
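One caveat when running this on the data exactly as posted in the question: there, Response is built from quoted values, so the column is character and mean() would return NA with a warning. The following is a minimal sketch of the same lagged filter with a numeric conversion first. The name df_q is only illustrative, and dplyr is assumed to be installed.

```r
# Data exactly as posted in the question: every column is a character vector
Trial <- c("1", "1", "2", "2", "3", "3", "4", "4", "5", "5",
           "6", "6", "7", "7", "8", "8", "9", "9", "10", "10")
Session <- c("2", "6", "2", "6", "2", "6", "2", "6", "2", "6",
             "2", "6", "2", "6", "2", "6", "2", "6", "2", "6")
Type <- c("x", "x", "x", "x", "y", "y", "x", "x", "y", "y",
          "y", "y", "x", "x", "y", "y", "y", "y", "x", "x")
Response <- c("3", "2", "2", "4", "2", "4", "6", "1", "3", "4",
              "2", "5", "1", "6", "5", "4", "6", "1", "3", "4")
df_q <- data.frame(Trial, Session, Type, Response)  # df_q is an illustrative name

# Convert the responses to numbers before averaging
df_q$Response <- as.numeric(df_q$Response)

# Same condition as in the answer; Session is compared as a string here
q_mean <- mean(df_q$Response[which(df_q$Session == "2" &
                                   df_q$Type == "x" &
                                   dplyr::lag(df_q$Session) == "6" &
                                   dplyr::lag(df_q$Type) == "y")])
q_mean
#> [1] 3.333333
```

With the question's values, the three matching rows (7, 13 and 19) hold the responses 6, 1 and 3, so the mean is 10/3.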

Data in reproducible format

    df <- data.frame(Trial = rep(1:10, each = 2),
                     Session = rep(c(2, 6), 10),
                     Type = rep(rep(c("x", "y"), length.out = 7), 
                                times = c(4, 2, 2, 4, 2, 4, 2)),
                     Response = c(2, 4:6, 3, 2, 3, 3, 4, 2, 3, 4, 5, 2, 2, 3, 3,
                                  4, 2, 3))

    df
    #>    Trial Session Type Response
    #> 1      1       2    x        2
    #> 2      1       6    x        4
    #> 3      2       2    x        5
    #> 4      2       6    x        6
    #> 5      3       2    y        3
    #> 6      3       6    y        2
    #> 7      4       2    x        3
    #> 8      4       6    x        3
    #> 9      5       2    y        4
    #> 10     5       6    y        2
    #> 11     6       2    y        3
    #> 12     6       6    y        4
    #> 13     7       2    x        5
    #> 14     7       6    x        2
    #> 15     8       2    y        2
    #> 16     8       6    y        3
    #> 17     9       2    y        3
    #> 18     9       6    y        4
    #> 19    10       2    x        2
    #> 20    10       6    x        3
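
If dplyr is not available, the same one-row shift can be sketched in base R. The helper lag1 below is a hypothetical name, not part of any package, and the sketch assumes the rows are already in the order in which the responses occurred.

```r
# Same data as in the answer above
df <- data.frame(Trial = rep(1:10, each = 2),
                 Session = rep(c(2, 6), 10),
                 Type = rep(rep(c("x", "y"), length.out = 7),
                            times = c(4, 2, 2, 4, 2, 4, 2)),
                 Response = c(2, 4:6, 3, 2, 3, 3, 4, 2, 3, 4, 5, 2, 2, 3, 3,
                              4, 2, 3),
                 stringsAsFactors = FALSE)

# Hypothetical helper: shift a vector down by one position, padding with NA
lag1 <- function(v) c(NA, head(v, -1))

# TRUE where the row is Session 2 / Type x and the previous row is Session 6 / Type y
keep <- df$Session == 2 & df$Type == "x" &
  lag1(df$Session) == 6 & lag1(df$Type) == "y"

# which() drops the NA that the first row produces (it has no previous row)
base_mean <- mean(df$Response[which(keep)])
base_mean
#> [1] 3.333333
```

Rows 7, 13 and 19 are the ones selected, with responses 3, 5 and 2.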

Just for fun, here is another approach with the same conditions:

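Interestingly, if in the code below we replace

    mutate(mean = ifelse(x == TRUE, sum(Response[x==TRUE])/ nrow(df[x==TRUE, ]), NA))

with

    mutate(mean = ifelse(x == TRUE, mean(Response), NA))

we get mean = 3.25, because mean(Response) is then computed over all 20 rows instead of only the flagged ones.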

    library(dplyr)
    df %>% 
      # flag rows that are Session 2 / Type x and follow a Session 6 / Type y row
      mutate(x = case_when(
        Session == 2 & 
          Type == "x" & 
          lag(Session) == 6 &
          lag(Type) == "y" ~ TRUE,
        TRUE ~ FALSE
      )) %>% 
      # mean over the flagged rows only; unflagged rows get NA
      mutate(mean = ifelse(x == TRUE, sum(Response[x==TRUE])/
                                            nrow(df[x==TRUE, ]), NA)) %>% 
      filter(!is.na(mean)) %>% 
      distinct(mean)
    #>       mean
    #> 1 3.333333
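
A shorter variant of the same idea, offered only as a sketch: filter() accepts the lagged conditions directly and drops rows where they evaluate to NA, so the mean is taken over the matching rows only. The names matched and overall are illustrative, and the data are those from the first answer.

```r
library(dplyr)

# Data from the first answer
df <- data.frame(Trial = rep(1:10, each = 2),
                 Session = rep(c(2, 6), 10),
                 Type = rep(rep(c("x", "y"), length.out = 7),
                            times = c(4, 2, 2, 4, 2, 4, 2)),
                 Response = c(2, 4:6, 3, 2, 3, 3, 4, 2, 3, 4, 5, 2, 2, 3, 3,
                              4, 2, 3))

# Mean over every row, which is where the 3.25 comes from
overall <- mean(df$Response)

# lag() is evaluated on the full columns before any rows are dropped,
# so each row is compared with the row directly above it
matched <- df %>%
  filter(Session == 2, Type == "x",
         lag(Session) == 6, lag(Type) == "y") %>%
  summarise(mean = mean(Response))

overall
#> [1] 3.25
matched$mean
#> [1] 3.333333
```

Filtering before summarising avoids the helper column and makes it explicit which rows enter the mean.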