How to return TRUE or FALSE based on multiple criteria in R?

I have a data.table that contains each person's entry and exit dates, plus a text column giving the reason for the exit. My data looks like this:

library(data.table)

dt <- data.table(ID = c(1,2,3,4,5),
                 entry = c("01/01/2010", "01/02/2016", "01/05/2010", "01/09/2013", "01/01/2010"),
                 exit = c("31/12/2010", "01/01/2021", "30/09/2010", "31/12/2015", "30/09/2010"),
                 text = c("a", NA, "c", NA, "b"),
                 result_2010 = c(NA, NA, NA, NA, NA))

   ID    entry      exit     text    result_2010
1:  1 01/01/2010 31/12/2010    a          NA
2:  2 01/02/2016 01/01/2021 <NA>          NA
3:  3 01/05/2010 30/09/2010    c          NA
4:  4 01/09/2013 31/12/2015 <NA>          NA
5:  5 01/01/2010 30/09/2010    b          NA

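In the column "result_2010" I want to flag whether the person left the company between 1 January 2010 and 31 December 2010, but only if the "text" column contains "a" or "c" for that person. Otherwise the result should be FALSE.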

The result should look like this:

    ID   entry       exit   text    result_2010
1:  1 01/01/2010 31/12/2010    a        TRUE
2:  2 01/02/2016 01/01/2021 <NA>       FALSE
3:  3 01/05/2010 30/09/2010    c        TRUE
4:  4 01/09/2013 31/12/2015 <NA>       FALSE
5:  5 01/01/2010 30/09/2010    b       FALSE

Does anyone know how I can do this?

We can convert the columns to Date class and create a logical column based on the conditions in the OP's post:

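library(dplyr)
library(lubridate)
dt %>% 
   # parse the day/month/year strings into Date objects
   mutate(across(c(entry, exit), dmy)) %>% 
   # TRUE only if the exit date falls in 2010 and text is "a" or "c"
   mutate(result_2010 = exit >= as.Date("2010-01-01") & 
     exit <= as.Date("2010-12-31") & text %in% c("a", "c"))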

-output

   ID      entry       exit text result_2010
1:  1 2010-01-01 2010-12-31    a        TRUE
2:  2 2016-02-01 2021-01-01 <NA>       FALSE
3:  3 2010-05-01 2010-09-30    c        TRUE
4:  4 2013-09-01 2015-12-31 <NA>       FALSE
5:  5 2010-01-01 2010-09-30    b       FALSE
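
A note on why rows with a missing text value come out as FALSE rather than NA: `%in%` never returns NA, whereas a comparison with `==` does. The following is a minimal base R sketch of that difference; the variable names `in_result` and `eq_result` are illustrative only and do not come from the question.

```r
# Same text values as in the question's data
text <- c("a", NA, "c", NA, "b")

# %in% treats NA as "not in the set", so the result is never NA
in_result <- text %in% c("a", "c")

# == propagates NA, so rows with a missing text stay NA
eq_result <- text == "a" | text == "c"

print(in_result)
print(eq_result)
```

With `==`, a row whose exit date does fall in 2010 but whose text is missing would end up NA instead of FALSE, so `%in%` is probably the safer choice for the result the question asks for.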

data.table

dt[, c("entry","exit") := lapply(.SD, as.Date, format = "%d/%m/%Y"), .SDcols = c("entry","exit")]
dt[, result_2010 := text %in% c("a", "c") & between(exit, as.Date("2010-01-01"), as.Date("2010-12-31"))]
#       ID      entry       exit   text result_2010
#    <num>     <Date>     <Date> <char>      <lgcl>
# 1:     1 2010-01-01 2010-12-31      a        TRUE
# 2:     2 2016-02-01 2021-01-01   <NA>       FALSE
# 3:     3 2010-05-01 2010-09-30      c        TRUE
# 4:     4 2013-09-01 2015-12-31   <NA>       FALSE
# 5:     5 2010-01-01 2010-09-30      b       FALSE

(This is effectively the data.table version of the same approach; both versions can benefit from the readability of data.table::between or dplyr::between.)
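
For comparison, here is a rough sketch of what the dplyr variant might look like with `dplyr::between`. It assumes a plain data.frame in place of the data.table (to keep the example small), parses the dates with `as.Date()` and the `%d/%m/%Y` format, and stores the result in a hypothetical object called `res`. The call is written as `dplyr::between` because data.table exports a function of the same name.

```r
library(dplyr)

# Same data as in the question, as a plain data.frame (an assumption of this sketch)
df <- data.frame(
  ID    = c(1, 2, 3, 4, 5),
  entry = c("01/01/2010", "01/02/2016", "01/05/2010", "01/09/2013", "01/01/2010"),
  exit  = c("31/12/2010", "01/01/2021", "30/09/2010", "31/12/2015", "30/09/2010"),
  text  = c("a", NA, "c", NA, "b")
)

res <- df %>%
  # parse the day/month/year strings into Date objects
  mutate(across(c(entry, exit), ~ as.Date(.x, format = "%d/%m/%Y"))) %>%
  # between() is inclusive on both ends, so 2010-12-31 counts as inside 2010
  mutate(result_2010 = text %in% c("a", "c") &
           dplyr::between(exit, as.Date("2010-01-01"), as.Date("2010-12-31")))

print(res)
```

Both bounds of `between` are inclusive, which matches the requirement that an exit on 31 December 2010 still counts.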