How to recode a new date variable and select the lowest date out of four date columns in R
Sample data
stack_dat <- structure(list(bio_drug_stop_date = structure(c(15376, NA, 15602, NA, 15550, NA, 15350, 15363, 15418, 16157), class = "Date"),
follow_up_2_years = structure(c(16047, 14318, 16038, 14352, 16044, 16582, 16054, 16048, 16054, 16054), class = "Date"),
date_of_last_visit = structure(c(17836, 16405, 17591, 16801, 17866, 15826, 17866, 17257, 18109, 16587), class = "Date"),
end_of_follow_up_date = structure(c(NA, 17928, NA, 17928, 17900, 16980, 16890, 17100, NA, NA), class = "Date"), data_cut_date = structure(c(18201,
18201, 18201, 18201, 18201, 18201, 18201, 18201, 18201, 18201), class = "Date")), row.names = c(NA, 10L), class = "data.frame")
Structure
'data.frame': 10 obs. of 5 variables:
$ bio_drug_stop_date : Date, format: "2012-02-06" NA "2012-09-19" NA ...
$ follow_up_2_years : Date, format: "2013-12-08" "2009-03-15" "2013-11-29" "2009-04-18" ...
$ date_of_last_visit : Date, format: "2018-11-01" "2014-12-01" "2018-03-01" "2016-01-01" ...
$ end_of_follow_up_date: Date, format: NA "2019-02-01" NA "2019-02-01" ...
$ data_cut_date : Date, format: "2019-11-01" "2019-11-01" "2019-11-01" "2019-11-01" ...
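Aim
The aim is to create a new variable called treatment_end that takes the date in bio_drug_stop_date; where that date is missing, it should instead take the lowest date across the other four columns: follow_up_2_years, date_of_last_visit, end_of_follow_up_date, data_cut_date.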
We can use pmin and coalesce: coalesce 'bio_drug_stop_date' with the row-wise minimum (computed with pmin) of the dates in the other columns.
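To see how the two building blocks behave on their own, here is a minimal sketch on toy vectors. The names a, b, primary, row_min and result and all of the dates are made up for illustration; they are not part of stack_dat.

```r
library(dplyr)

# Toy Date vectors (hypothetical values, not taken from stack_dat)
a       <- as.Date(c("2020-01-10", NA, "2020-03-01"))
b       <- as.Date(c("2020-02-01", "2020-01-05", NA))
primary <- as.Date(c(NA, "2019-12-31", NA))

# pmin() compares element by element; na.rm = TRUE skips missing values,
# so a position is NA only if every input is NA there
row_min <- pmin(a, b, na.rm = TRUE)

# coalesce() keeps the first non-missing value at each position:
# primary where it exists, otherwise row_min
result <- coalesce(primary, row_min)
result
```

Only the second element of primary is present, so result keeps it there and falls back to row_min in the first and third positions.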
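library(dplyr)
library(purrr)

stack_dat %>%
  # keep bio_drug_stop_date where present; otherwise take the earliest
  # date per row across the remaining columns (across(-1) drops column 1)
  mutate(treatment_end = coalesce(bio_drug_stop_date,
                                  invoke(pmin, across(-1), na.rm = TRUE)))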
Output
bio_drug_stop_date follow_up_2_years date_of_last_visit end_of_follow_up_date data_cut_date treatment_end
1 2012-02-06 2013-12-08 2018-11-01 <NA> 2019-11-01 2012-02-06
2 <NA> 2009-03-15 2014-12-01 2019-02-01 2019-11-01 2009-03-15
3 2012-09-19 2013-11-29 2018-03-01 <NA> 2019-11-01 2012-09-19
4 <NA> 2009-04-18 2016-01-01 2019-02-01 2019-11-01 2009-04-18
5 2012-07-29 2013-12-05 2018-12-01 2019-01-04 2019-11-01 2012-07-29
6 <NA> 2015-05-27 2013-05-01 2016-06-28 2019-11-01 2013-05-01
7 2012-01-11 2013-12-15 2018-12-01 2016-03-30 2019-11-01 2012-01-11
8 2012-01-24 2013-12-09 2017-04-01 2016-10-26 2019-11-01 2012-01-24
9 2012-03-19 2013-12-15 2019-08-01 <NA> 2019-11-01 2012-03-19
10 2014-03-28 2013-12-15 2015-06-01 <NA> 2019-11-01 2014-03-28
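The same result can also be sketched in base R, which avoids purrr::invoke() (recent purrr releases mark it as deprecated, as far as I know). This is only an illustrative alternative, not the method used above. The data frame dat below reproduces rows 1, 2 and 6 of the sample data so that the block runs on its own; the names dat, fallback_cols, fallback and missing_stop are introduced here for illustration.

```r
# Rows 1, 2 and 6 of the sample data, retyped so the example is self-contained
dat <- data.frame(
  bio_drug_stop_date    = as.Date(c("2012-02-06", NA, NA)),
  follow_up_2_years     = as.Date(c("2013-12-08", "2009-03-15", "2015-05-27")),
  date_of_last_visit    = as.Date(c("2018-11-01", "2014-12-01", "2013-05-01")),
  end_of_follow_up_date = as.Date(c(NA, "2019-02-01", "2016-06-28")),
  data_cut_date         = as.Date(c("2019-11-01", "2019-11-01", "2019-11-01"))
)

# Columns to fall back on when bio_drug_stop_date is missing
fallback_cols <- c("follow_up_2_years", "date_of_last_visit",
                   "end_of_follow_up_date", "data_cut_date")

# Row-wise minimum over the fallback columns, ignoring NA:
# each column is passed to pmin() as a separate argument
fallback <- do.call(pmin, c(dat[fallback_cols], na.rm = TRUE))

# Start from bio_drug_stop_date and fill its gaps with the fallback
dat$treatment_end <- dat$bio_drug_stop_date
missing_stop <- is.na(dat$treatment_end)
dat$treatment_end[missing_stop] <- fallback[missing_stop]

dat$treatment_end
```

Naming the fallback columns explicitly, rather than selecting everything except the first column, keeps the result stable if the column order changes or further columns are added later.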