How to paste the same value that meets conditions into a new column in R

Question

Edit: I had accidentally defined stimulus on the opening line of the code block, so it did not show up.

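I have a problem that I'm not sure how to solve, and any help would be greatly appreciated. Basically, I have some data that I want to reformat so that a value from a single cell that meets certain conditions is taken and repeated for every instance of a given condition.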

Below is a toy example that recreates the problem:

stimulus <- c("instructions","happy", "sad", "anger", "instructions", "happy", "sad", "anger", "instructions", "happy", "sad", "anger")
test_part <- c("comprehension", "emotion", "emotion", "emotion", "comprehension", "emotion", "emotion", "emotion", "comprehension", "emotion", "emotion", "emotion")
answer <- c(0, 1,3,5,0, 7,2,1, 0, 1,7,2)
condition <- c(NA, "angry", "angry", "angry", NA, "happy", "happy", "happy", NA, "sad", "sad", "sad")
id <-c(1,1,1,1,1,1,1,1,1,1,1,1)
mydata<-data.frame(id, condition, test_part, stimulus, answer)
head(mydata)

Here is how the data are set up:

> head(mydata)
  id condition     test_part     stimulus answer
1  1      <NA> comprehension instructions      0
2  1     angry       emotion        happy      1
3  1     angry       emotion          sad      3
4  1     angry       emotion        anger      5
5  1      <NA> comprehension instructions      0
6  1     happy       emotion        happy      7

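I want to create a new column that, for each emotion "test_part" row, repeats the answer for the stimulus matching the condition. So, for example, I first want a new variable called AngryRating, which takes the "answer" value from the row containing anger and repeats it for every row of that condition (angry), then repeats the "answer" value of the next anger row for every row of the next condition (happy), and so on.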

Here is how I tried to get the correct values into the new column/variable:

mydata$AngryRating <- ifelse(mydata$stimulus == "anger" & mydata$test_part == "emotion",
                                paste0(mydata$answer), NA)
head(mydata)

> head(mydata)
  id condition     test_part     stimulus answer AngryRating
1  1      <NA> comprehension instructions      0        <NA>
2  1     angry       emotion        happy      1        <NA>
3  1     angry       emotion          sad      3        <NA>
4  1     angry       emotion        anger      5           5
5  1      <NA> comprehension instructions      0        <NA>
6  1     happy       emotion        happy      7        <NA>

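But here I only get a value for the one row where condition == angry, test_part == emotion and stimulus == anger. What I want is for the value 5 to be pasted into all rows where condition == angry (and the next anger value into all rows where condition == happy).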

Like this:

answerFormatted <-c(NA,5,5,5,NA,1,1,1, NA,2,2,2)
mydesireddata<-data.frame(id, condition, test_part, stimulus, answer, answerFormatted)
head(mydesireddata)

> head(mydesireddata)
  id condition     test_part     stimulus answer answerFormatted
1  1      <NA> comprehension instructions      0              NA
2  1     angry       emotion        happy      1               5
3  1     angry       emotion          sad      3               5
4  1     angry       emotion        anger      5               5
5  1      <NA> comprehension instructions      0              NA
6  1     happy       emotion        happy      7               1

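It may be that I need a loop or something similar, but I'm just not sure how to repeat the value under the right conditions in an efficient way. Again, any help would be appreciated!!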

Answer 1

I'm not sure if this is what you are looking for. You could use

library(dplyr)
library(stringr)
library(tidyr)

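mydata %>% 
  # grp numbers the blocks (a new block starts at each NA in condition);
  # grp2 separates the NA row from the other rows of its block
  group_by(id, grp = cumsum(is.na(condition)), grp2 = is.na(condition)) %>% 
  mutate(
    # keep answer only where condition and stimulus share their first three letters
    AngryRating = ifelse(
      str_extract(condition, "^.{3}") == str_extract(stimulus, "^.{3}"), 
      answer, NA_real_)) %>% 
  # spread that value to every row of the group
  fill(AngryRating, .direction = "updown") %>% 
  ungroup() %>% 
  # drops grp only; grp2 stays in the result (write -grp2 to drop it as well)
  select(-grp, grp2)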

to get

# A tibble: 12 × 7
      id condition test_part     stimulus     answer grp2  AngryRating
   <dbl> <chr>     <chr>         <chr>         <dbl> <lgl>       <dbl>
 1     1 NA        comprehension instructions      0 TRUE           NA
 2     1 angry     emotion       happy             1 FALSE           5
 3     1 angry     emotion       sad               3 FALSE           5
 4     1 angry     emotion       anger             5 FALSE           5
 5     1 NA        comprehension instructions      0 TRUE           NA
 6     1 happy     emotion       happy             7 FALSE           7
 7     1 happy     emotion       sad               2 FALSE           7
 8     1 happy     emotion       anger             1 FALSE           7
 9     1 NA        comprehension instructions      0 TRUE           NA
10     1 sad       emotion       happy             1 FALSE           7
11     1 sad       emotion       sad               7 FALSE           7
12     1 sad       emotion       anger             2 FALSE           7

What is happening here?

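First we group the data by id (on the assumption that different ids should not be mixed) and into blocks of consecutive test_parts (the blocks are delimited by the NAs in condition).
If condition and stimulus match, we take the value from the answer column. Since angry and anger are not identical, we only compare the first three letters.
Finally, we apply this answer (AngryRating) to all entries in the same block (as defined above).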
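
Note that the approach above repeats the answer to the stimulus that matches each block's condition (5, 7, 7), whereas the answerFormatted column in the question holds the answer from each block's anger row (5, 1, 2). If that second reading is the intended one, the same block idea can be reused. What follows is only a minimal sketch, not the answerer's method: the names block and anger_answer are made up for illustration, and it assumes each block contains exactly one emotion row with stimulus "anger".

```r
library(dplyr)

# Toy data copied from the question
stimulus  <- rep(c("instructions", "happy", "sad", "anger"), times = 3)
test_part <- rep(c("comprehension", "emotion", "emotion", "emotion"), times = 3)
answer    <- c(0, 1, 3, 5, 0, 7, 2, 1, 0, 1, 7, 2)
condition <- c(NA, rep("angry", 3), NA, rep("happy", 3), NA, rep("sad", 3))
id        <- rep(1, 12)
mydata    <- data.frame(id, condition, test_part, stimulus, answer)

result <- mydata %>%
  # A new block starts at every row where condition is NA
  # (block is a hypothetical helper column)
  group_by(id, block = cumsum(is.na(condition))) %>%
  mutate(
    # Answer given to the "anger" stimulus within this block;
    # [1] keeps the first match (assumption: one anger row per block)
    anger_answer = answer[stimulus == "anger" & test_part == "emotion"][1],
    # Rows without a condition (the instruction rows) stay NA
    AngryRating = ifelse(is.na(condition), NA_real_, anger_answer)
  ) %>%
  ungroup() %>%
  select(-block, -anger_answer)

result$AngryRating
# NA 5 5 5 NA 1 1 1 NA 2 2 2
```

Because the anger row is looked up by name inside each block, no first-three-letters comparison between condition and stimulus is needed here.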

r

data-wrangling