In R, conditionally mutate a new column for all of an ID's rows

Question

Background

I have this data frame df:

df <- data.frame(ID =    c("a","a","a","b", "c","c","c","c"),
                 event = c("red","black","blue","white", "orange","red","gray","green"),
                 stringsAsFactors=FALSE)

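It contains some people (ID) and a description of an event. I want to create a new variable, condition, that is 1 or 0 depending on whether any of a given ID's cells contains "red" or "blue".

The problem

I can get this to work, but only for the matching rows. What I want is that if any of a person's cells in event contains "red" or "blue", then all of that person's cells in condition are marked 1. In other words, I want this: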

ID  event condition
 a    red         1
 a  black         1
 a   blue         1
 b  white         0
 c orange         1
 c    red         1
 c   gray         1
 c  green         1

What I've tried

So far, I've used this code and gotten this result:

df <- df %>%
mutate(condition = ifelse(df$event %in% c("red","blue"), 1, 0))

ID  event condition
 a    red         1
 a  black         0
 a   blue         1
 b  white         0
 c orange         0
 c    red         1
 c   gray         0
 c  green         0

In other words, the matching rows are marked 1, but I want all rows of any ID that has a matching row to be marked 1.

Answer 1

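We need any() to wrap the logical vector from %in%, so that each ID's group yields a single TRUE/FALSE that mutate() recycles across all of the group's rows. The arguments to %in% can also be reversed. (The OP's code returns 1 only on the rows where event matches 'red' or 'blue', and 0 on the rest.)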

library(dplyr)
df %>% 
   group_by(ID) %>% 
   # any() gives one TRUE/FALSE per ID; the unary + converts it to 1/0
   mutate(condition = +(any(c('red', 'blue') %in% event))) %>%
   ungroup()

Output

# A tibble: 8 × 3
  ID    event  condition
  <chr> <chr>      <int>
1 a     red            1
2 a     black          1
3 a     blue           1
4 b     white          0
5 c     orange         1
6 c     red            1
7 c     gray           1
8 c     green          1
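
For comparison, the same per-ID logic can be sketched in base R without dplyr. This is only an illustrative alternative, not part of the answer above: it assumes the question's df, and the intermediate name hit is introduced here for the example. The row-wise flag is computed first, and ave() then spreads each ID's maximum over all of that ID's rows.

```r
# Same sample data as in the question
df <- data.frame(ID = c("a","a","a","b", "c","c","c","c"),
                 event = c("red","black","blue","white", "orange","red","gray","green"),
                 stringsAsFactors = FALSE)

# Row-wise flag: TRUE only on rows whose event is exactly "red" or "blue"
# (hit is a helper name made up for this sketch)
hit <- df$event %in% c("red", "blue")

# ave() applies FUN within each ID and fills the result back into
# every row of that ID, so max() acts like any() on the 0/1 flags
df$condition <- ave(as.integer(hit), df$ID, FUN = max)

print(df)
```

Taking the maximum of a 0/1 vector plays the same role as any() here: a group gets 1 as soon as one of its rows matches.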

Answer 2

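Here is another approach, using str_detect() from stringr. As with the first answer, the row-wise test has to be wrapped in any() so that the flag applies to all of an ID's rows:

library(dplyr)
library(stringr)

df %>% 
  group_by(ID) %>% 
  # any() collapses the per-row matches to a single value per ID
  mutate(condition = if_else(any(str_detect(event, paste(c("red", "blue"), collapse = "|"))), 1, 0)) %>%
  ungroup()

  ID    event  condition
  <chr> <chr>      <dbl>
1 a     red            1
2 a     black          1
3 a     blue           1
4 b     white          0
5 c     orange         1
6 c     red            1
7 c     gray           1
8 c     green          1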
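
One difference between the two answers is worth keeping in mind: %in% tests for exact equality, while str_detect() with the pattern "red|blue" does a regular-expression search and so also matches substrings. The sketch below illustrates this with hypothetical event values ("bored", "bluegreen") that do not appear in the question's data; the names events, pattern, and anchored are made up for the example.

```r
library(stringr)

# Hypothetical event values, not taken from the question's data
events <- c("red", "bored", "blue", "bluegreen", "gray")

# Same pattern construction as in the answer: "red|blue"
pattern <- paste(c("red", "blue"), collapse = "|")

# Substring match: also flags "bored" and "bluegreen"
substring_hits <- str_detect(events, pattern)

# Exact match: flags only "red" and "blue"
exact_hits <- events %in% c("red", "blue")

# Anchoring the alternation makes the regex behave like the exact match
anchored <- paste0("^(", pattern, ")$")
anchored_hits <- str_detect(events, anchored)

print(data.frame(events, substring_hits, exact_hits, anchored_hits))
```

On the question's data both tests agree, since none of the other colours contains "red" or "blue" as a substring, so the choice only matters if the real event values are less tidy.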

r

dplyr