Updating based on Condition of Previous Occurrence

Question

I have a data frame:

  stim1 stim2 Chosen Rejected
1:     2     1      2        1
2:     3     2      2        3
3:     3     1      1        3
4:     2     3      3        2
5:     1     3      1        3

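My objective is to add, for each trial, a column specifying whether each stimulus was most recently (in previous trials) chosen or rejected.

Desired outcome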

  stim1 stim2 Chosen Rejected     Previous_stim1   Previous_stim2
1:     2     1      2        1        NaN              NaN
2:     3     2      2        3        NaN              Chosen
3:     3     1      1        3        Rejected         Rejected
4:     2     3      3        2        Chosen           Rejected
5:     1     3      1        3        Chosen           Chosen

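Any help would be much appreciated!

Update

TarJae made a very helpful suggestion that correctly classified the data frame I shared. I did not mention that it is actually part of a larger data frame, and for some reason the method quickly stops classifying correctly: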

   stim1 stim2 Chosen Rejected Previous_stim1 Previous_stim2
 1:     2     1      2        1           <NA>           <NA>
 2:     3     2      2        3           <NA>         Chosen
 3:     3     1      1        3       Rejected       Rejected
 4:     2     3      3        2         Chosen       Rejected
 5:     1     3      1        3         Chosen         Chosen
 6:     2     1      1        2         Chosen         Chosen
 7:     2     3      2        3         Chosen         Chosen
 8:     3     1      1        3         Chosen         Chosen
 9:     2     1      2        1         Chosen         Chosen

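For example, in row 6 stim1==2. Most recently, 2 was rejected (row 4), but the method classifies it as Chosen.

Any idea what is going on?

Thanks again for all the help.

Update 2

Thank you very much for your help. But say I also have a column for "outcome".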

   stim1 stim2 Chosen Rejected outcome Previous_stim 1 Previous_stim 2
1:    15    13     15       13       1            <NA>            <NA>
2:    13    14     14       13       1        Rejected            <NA>
3:    14    15     14       15       1          Chosen          Chosen
4:    14    13     14       13       0          Chosen        Rejected
5:    13    15     13       15       0        Rejected        Rejected
6:    14    15     14       15       1          Chosen        Rejected
7:    15    13     15       13       1        Rejected          Chosen
8:    14    15     14       15       0          Chosen          Chosen

I want to encode whether each stimulus was:
1) most recently chosen and outcome=1 (can be coded as 1)
2) most recently chosen and outcome=0 (can be coded as 2)
3) most recently rejected and outcome=1 (can be coded as 3)
4) most recently rejected and outcome=0 (can be coded as 4)

Is there a simple way to modify the code to achieve this?

Desired output

  stim1 stim2 Chosen Rejected outcome Previous_stim 1 Previous_stim 2 Left_type right_type
1     2     3      2        3       1            <NA>            <NA>       NaN        NaN
2     1     3      3        1       1            <NA>        Rejected       NaN          3
3     2     1      1        2       1          Chosen        Rejected         1          3
4     1     2      1        2       0          Chosen        Rejected         1          3
5     3     1      3        1       1          Chosen          Chosen         1          2

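Final follow-up

Finally, I would like to add a column that checks whether the stimulus that was paired with the current stimulus in the most recent earlier trial where it appeared (I will call this the most recent relevant trial) is the same as my current alternative stimulus.

For example, given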

  stim1 stim2 Chosen Rejected     Previous_stim1   Previous_stim2
1:     2     1      2        1        NaN              NaN
2:     3     2      2        3        NaN              Chosen
3:     3     1      1        3        Rejected         Rejected
4:     2     3      3        2        Chosen           Rejected
5:     1     3      1        3        Chosen           Chosen

This is how I would update the table:

In trial 3, stim1 (i.e. 3) was previously rejected in favor of 2 (in trial 2) and not in favor of 1 (which is the current alternative), so Current_alternative_left=0.

Similarly, stim2 (i.e. 1) was previously rejected, but it was rejected in favor of 2 (in trial 1), so Current_alternative_right=0.

On the other hand, in trial 4 stim1=2 was previously chosen relative to the same stimulus it is currently up against (3), so Current_alternative_left=1.

Desired output

  stim1 stim2 Chosen Rejected outcome Previous_stim 1 Previous_stim 2 Left_type right_type
1     2     3      2        3       1            <NA>            <NA>       NaN        NaN
2     1     3      3        1       1            <NA>        Rejected       NaN          3
3     2     1      1        2       1          Chosen        Rejected         1          3
4     1     2      1        2       0          Chosen        Rejected         1          3
5     3     1      3        1       1          Chosen          Chosen         1          2

Current_alternative_left    Current_alternative_right
NaN                           NaN
NaN                           0
0                             0 
1                             0
1                             0

I am new to data.table, but I also tried to adapt ThomasIsCoding's function to return this as well:

h <- function(stim, cr) {
  stim_chosen <- rep(NA, length(stim))
  for (k in seq_along(stim)[-1]) {
    ind <- which(cr[1:(k - 1), , drop = FALSE] == stim[k], arr.ind = TRUE)
    if (length(ind)) {
      stim_chosen[k] <- stim[tail(ind, 1)[, "row"]]
    }
  }
  stim_chosen
}

setDT(df)[
  ,
  paste0("Chosen_Last", 1:2) := lapply(
    .(stim1, stim2),
    h,
    cr = cbind(Chosen, Rejected)
  )
]

However, this does not give me the right answer. Does anyone know where I went wrong?

Answer 1

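I think you need to correct the expected result in your table above. That said, it looks like you are looking for the lag verb, which can help here when used together with if_else: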

library(dplyr)

tbl <- tibble(stim1 = c(2,3,3,2,1), stim2 = c(1,2,1,3,3), 
              chosen = c(2,2,1,3,1), rejected = c(1,3,3,2,3))

tbl %>%
  mutate(Previous_stim1 = if_else(lag(chosen) == lag(stim1), "Chosen", "Rejected")) %>%
  mutate(Previous_stim2 = if_else(lag(chosen) == lag(stim2), "Chosen", "Rejected"))
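
Keep in mind that lag() with its default n = 1 only looks one row back, so on its own it cannot find the most recent earlier trial in which a stimulus appeared. Below is a minimal base-R sketch of what it returns. The helper name lag1 is hypothetical and only mimics dplyr::lag(x) with default arguments:

```r
# Hypothetical stand-in for dplyr::lag(x) with default arguments:
# shift the vector down by one position and pad the front with NA.
lag1 <- function(x) c(NA, head(x, -1))

chosen <- c(2, 2, 1, 3, 1)
stim1  <- c(2, 3, 3, 2, 1)

prev_chosen <- lag1(chosen)
prev_chosen
# NA 2 2 1 3

# Comparing with the current stimulus only sees the immediately preceding trial.
# In trial 4, stim1 is 2, which was chosen two trials earlier (trial 2),
# so a one-row lookback misses it.
one_back <- prev_chosen == stim1
one_back
# NA FALSE FALSE FALSE FALSE
```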

Answer 2

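This is a solution produced with the help of ThomasIsCoding. There are other answers here that are sufficient for your problem; you can modify and adapt whichever one suits you. I chose the first one provided by ThomasIsCoding.

The main task is to check the values in all previous rows of the other columns: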

library(dplyr)
df %>% 
    mutate(x = replace(rep(NA, length(Chosen)), match(stim1, lag(Chosen)) <= seq_along(stim1), "Chosen"),
           y = replace(rep(NA, length(Rejected)), match(stim1, lag(Rejected)) <= seq_along(stim1), "Rejected"),
           a = replace(rep(NA, length(Chosen)), match(stim2, lag(Chosen)) <= seq_along(stim2), "Chosen"),
           b = replace(rep(NA, length(Rejected)), match(stim2, lag(Rejected)) <= seq_along(stim2), "Rejected"),
           Previous_stim1 = coalesce(x, y),
           Previous_stim2 = coalesce(a, b)) %>% 
    select(stim1, stim2, Chosen, Rejected, Previous_stim1, Previous_stim2)

   stim1 stim2 Chosen Rejected Previous_stim1 Previous_stim2
1:     2     1      2        1           <NA>           <NA>
2:     3     2      2        3           <NA>         Chosen
3:     3     1      1        3       Rejected       Rejected
4:     2     3      3        2         Chosen       Rejected
5:     1     3      1        3         Chosen         Chosen
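
One likely reason this approach breaks down on the longer data from the question's update is that match() returns the position of the first occurrence, not the most recent one. A small base-R illustration, using the Chosen and Rejected columns of the nine-row data and looking at stimulus 2 just before trial 6:

```r
Chosen   <- c(2, 2, 1, 3, 1, 1, 2, 1, 2)
Rejected <- c(1, 3, 3, 2, 3, 2, 3, 3, 1)

# match() reports only the FIRST position at which the value occurs
first_chosen   <- match(2, Chosen)     # 1
first_rejected <- match(2, Rejected)   # 4

# Most recent occurrence among the trials before trial 6 (rows 1 to 5)
last_chosen   <- max(which(Chosen[1:5] == 2))     # 2
last_rejected <- max(which(Rejected[1:5] == 2))   # 4

# Stimulus 2 was rejected (trial 4) more recently than it was chosen (trial 2)
last_rejected > last_chosen   # TRUE
```

Because the code above also combines the two columns with coalesce(x, y), which takes the first non-missing value, "Chosen" wins whenever the stimulus was chosen at any earlier point.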

Answer 3

For Update 2 (using the function f defined further down in this answer)

setDT(df)[
  ,
  paste0("Previous_stim", 1:2) := lapply(
    .(stim1, stim2),
    f,
    cr = cbind(Chosen, Rejected)
  )
][
  ,
  paste0(c("left", "right"), "type") := lapply(.SD, function(x) 2 * (x == "Rejected") + 2 - outcome),
  .SDcols = patterns("Previous")
][]

gives

   stim1 stim2 Chosen Rejected outcome Previous_stim1 Previous_stim2 lefttype righttype
1:     2     1      2        1       1           <NA>           <NA>       NA        NA
2:     3     2      2        3       1           <NA>         Chosen       NA         1
3:     3     1      1        3       1       Rejected       Rejected        3         3
4:     2     3      3        2       0         Chosen       Rejected        2         4
5:     1     3      1        3       0         Chosen         Chosen        2         2
6:     2     1      1        2       1       Rejected         Chosen        3         1
7:     2     3      2        3       1       Rejected       Rejected        3         3
8:     3     1      1        3       0       Rejected         Chosen        4         2
9:     2     1      2        1       0         Chosen         Chosen        2         2
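
To see how the expression 2 * (x == "Rejected") + 2 - outcome maps onto the four codes in the question, here is a small sketch that evaluates it for every combination. The name encode_type is only for illustration and does not appear in the code above:

```r
# Illustrative wrapper around the arithmetic used in the answer
encode_type <- function(prev, outcome) 2 * (prev == "Rejected") + 2 - outcome

# All four combinations of previous status and outcome
grid <- expand.grid(
  prev = c("Chosen", "Rejected"),
  outcome = c(1, 0),
  stringsAsFactors = FALSE
)
grid$type <- encode_type(grid$prev, grid$outcome)
grid
#       prev outcome type
# 1   Chosen       1    1
# 2 Rejected       1    3
# 3   Chosen       0    2
# 4 Rejected       0    4

# A missing previous status propagates as NA
encode_type(NA, 1)
# NA
```

Note that the outcome used here is the one in the same row as the stimulus, i.e. the current trial's outcome column.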

Data

> dput(df)
structure(list(stim1 = c(2L, 3L, 3L, 2L, 1L, 2L, 2L, 3L, 2L),
    stim2 = c(1L, 2L, 1L, 3L, 3L, 1L, 3L, 1L, 1L), Chosen = c(2L,
    2L, 1L, 3L, 1L, 1L, 2L, 1L, 2L), Rejected = c(1L, 3L, 3L,
    2L, 3L, 2L, 3L, 3L, 1L), outcome = c(1, 1, 1, 0, 0, 1, 1,
    0, 0)), class = "data.frame", row.names = c(NA, -9L))

Based on your update, you can try the following code, which defines a custom function f:

f <- function(stim, cr) {
  res <- rep(NA, length(stim))
  for (k in seq_along(stim)[-1]) {
    ind <- which(cr[1:(k - 1), , drop = FALSE] == stim[k], arr.ind = TRUE)
    if (length(ind)) {
      res[k] <- colnames(cr)[tail(ind[, "col"][order(ind[, "row"])], 1)]
    }
  }
  res
}

setDT(df)[
  ,
  paste("Previous_stim", 1:2) := lapply(
    .(stim1, stim2),
    f,
    cr = cbind(Chosen, Rejected)
  )
][]

You will see

> setDT(df)[, paste("Previous_stim",1:2) := lapply(.(stim1,stim2),f, cr = cbind(Chosen, Rejected))][]
   stim1 stim2 Chosen Rejected Previous_stim 1 Previous_stim 2
1:     2     1      2        1            <NA>            <NA>
2:     3     2      2        3            <NA>          Chosen
3:     3     1      1        3        Rejected        Rejected
4:     2     3      3        2          Chosen        Rejected
5:     1     3      1        3          Chosen          Chosen
6:     2     1      1        2        Rejected          Chosen
7:     2     3      2        3        Rejected        Rejected
8:     3     1      1        3        Rejected          Chosen
9:     2     1      2        1          Chosen          Chosen
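
For the final follow-up in the question, one possible approach is sketched below in base R. This is not part of the original answers: the function name same_alternative and its arguments are hypothetical, and it assumes that Chosen and Rejected always differ within a trial. For each trial it finds the most recent earlier trial in which the stimulus appeared, looks up what it was paired with there, and flags whether that is the current alternative.

```r
# Hypothetical helper: stim is the column of interest, other is the competing
# column in the same trial, cr is a two-column matrix of Chosen and Rejected.
same_alternative <- function(stim, other, cr) {
  res <- rep(NA_integer_, length(stim))
  for (k in seq_along(stim)[-1]) {
    # earlier trials in which the current stimulus was chosen or rejected
    rows <- which(rowSums(cr[1:(k - 1), , drop = FALSE] == stim[k]) > 0)
    if (length(rows)) {
      r <- max(rows)                      # most recent such trial
      alt <- cr[r, cr[r, ] != stim[k]]    # what it was paired with back then
      res[k] <- as.integer(alt == other[k])
    }
  }
  res
}

df5 <- data.frame(
  stim1 = c(2, 3, 3, 2, 1), stim2 = c(1, 2, 1, 3, 3),
  Chosen = c(2, 2, 1, 3, 1), Rejected = c(1, 3, 3, 2, 3)
)
cr <- cbind(Chosen = df5$Chosen, Rejected = df5$Rejected)

df5$Current_alternative_left  <- same_alternative(df5$stim1, df5$stim2, cr)
df5$Current_alternative_right <- same_alternative(df5$stim2, df5$stim1, cr)
df5[, c("Current_alternative_left", "Current_alternative_right")]
#   Current_alternative_left Current_alternative_right
# 1                       NA                        NA
# 2                       NA                         0
# 3                        0                         0
# 4                        1                         0
# 5                        1                         0
```

On the five-row example this reproduces the Current_alternative columns listed in the question. The same function could be called inside a data.table `:=` expression in the same way f is used above.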

Tags: r, dataframe, dplyr, data-wrangling