Add new column (or filter) based on multiple columns and satisfying multiple conditions

Question

Here is a simplified example of the problem:

library(tidyverse)  # provides %>%, as_tibble(), filter_all(), str_detect()

dt <- 
  read.table(textConnection("names age location
Ann  32     Glas
Annie  31       US
Anne  40     Glas
Kerri  31     Edin
David  39      Fra
Glas  29     Annie
Lindsay  24       US
Lynsey  37     Glas
Glas  Annie       US
Lila  39      Fra
Layla  37       US"), 
             header = TRUE, 
             sep = "", 
             stringsAsFactors = FALSE) %>% 
  as_tibble()

I want to add a new column called AnnGlas that flags every row in which both "ann" and "glas" are present.

I know how to filter for either one of them, but not for both at the same time:

dt %>% filter_all(any_vars(str_detect(str_to_lower(.), "glas|ann")))

Using apply, I can likewise find a match for either term, but not for both at once:

apply(dt, 2, function(x) str_detect(str_to_lower(x), "glas|ann"))

I need some way to check whether any column in a row contains glas while another column contains ann, so that I can create the new column.

The output would look like this:

  names   age   location desc 
   <chr>   <chr> <chr>    <lgl>
 1 Ann     32    Glas     TRUE 
 2 Annie   31    US       FALSE
 3 Anne    40    Glas     TRUE 
 4 Kerri   31    Edin     FALSE
 5 David   39    Fra      FALSE
 6 Glas    29    Annie    TRUE 
 7 Lindsay 24    US       FALSE
 8 Lynsey  37    Glas     FALSE
 9 Glas    Annie US       TRUE 
10 Lila    39    Fra      FALSE
11 Layla   37    US       FALSE

Answer 1

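Since we want both terms to be present, we can check for each of them separately using lapply and Reduce. Checking both patterns together (as in "glas|ann") can return TRUE even when only one of the terms is present in the row.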

dt$desc <- Reduce(`|`, lapply(dt, grepl, pattern = "Glas")) & 
            Reduce(`|`, lapply(dt, grepl, pattern = "Ann"))

dt
# A tibble: 11 x 4
#   names   age   location desc 
#   <chr>   <chr> <chr>    <lgl>
# 1 Ann     32    Glas     TRUE 
# 2 Annie   31    US       FALSE
# 3 Anne    40    Glas     TRUE 
# 4 Kerri   31    Edin     FALSE
# 5 David   39    Fra      FALSE
# 6 Glas    29    Annie    TRUE 
# 7 Lindsay 24    US       FALSE
# 8 Lynsey  37    Glas     FALSE
# 9 Glas    Annie US       TRUE 
#10 Lila    39    Fra      FALSE
#11 Layla   37    US       FALSE
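
To illustrate why the two terms are checked separately, here is a minimal sketch on a small made-up data frame (not the question's full data). The helper name `any_col_matches` is hypothetical, and `ignore.case = TRUE` is an assumption added so that lowercase search terms such as "ann" and "glas" also match; the answer's code above is case-sensitive.

```r
# Small hypothetical data frame, used only for illustration
df <- data.frame(
  names    = c("Ann", "Annie", "Glas", "Lynsey"),
  location = c("Glas", "US", "Annie", "Glas"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each row, TRUE if any column matches `pattern`.
# ignore.case = TRUE is an assumption, not part of the original answer.
any_col_matches <- function(data, pattern) {
  Reduce(`|`, lapply(data, grepl, pattern = pattern, ignore.case = TRUE))
}

# Alternation: TRUE as soon as either term appears anywhere in the row
either <- any_col_matches(df, "glas|ann")

# Separate checks combined with &: TRUE only when both terms appear
both <- any_col_matches(df, "glas") & any_col_matches(df, "ann")

print(either)
print(both)
```

With this sample, `either` is TRUE for all four rows, while `both` is TRUE only for the first and third rows, where one column contains "glas" and another contains "ann".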
