How to subtract two date variables in R, with the result in days
My data frame has two columns, Entry_date and Death_date, containing dates in YYYY/MM/DD format. I want to compute survival_days = Death_date - Entry_date, that is, subtract Entry_date from Death_date, and get the result in days. My data looks like this.
Sample_ID<-c("a1","a2","a3","a4","a5","a6")
Entry_date<-c(2010/04/13, 2008/07/30, 2009/03/06, 2008/08/22, 2009/06/24, 2008/08/26)
Death_date<-c(2007/05/17, 2007/05/16, 2007/05/16, 2007/05/16,2007/05/16, 2010/05/16)
Df<-data.frame(Sample_ID,Entry_date,Death_date)
I would like a column named Df$survival_days as the result variable, as shown below
Sample_ID Entry_date Death_date Df$survival_days
-1062.00
-441.00
-660.00
-464.00
-770.00
468.00
How can I do this in R? I need this variable for a Cox regression survival analysis. My real data frame has about 10,000 observations.
You can use difftime() with units = "days".
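Use difftime with the appropriate units, and supply the dates as quoted strings (unquoted, 2010/04/13 is evaluated as arithmetic division rather than as a date):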
Sample_ID<-c("a1","a2","a3","a4","a5","a6")
Entry_date<-c("2010/04/13", "2008/07/30", "2009/03/06", "2008/08/22", "2009/06/24", "2008/08/26")
Death_date<-c("2007/05/17", "2007/05/16", "2007/05/16", "2007/05/16","2007/05/16", "2010/05/16")
Df<-data.frame(Sample_ID,Entry_date,Death_date)
Df$difference_in_days <- difftime(Df$Death_date, Df$Entry_date, units = "days")
Output
> Df
Sample_ID Entry_date Death_date difference_in_days
1 a1 2010/04/13 2007/05/17 -1062.0000 days
2 a2 2008/07/30 2007/05/16 -441.0000 days
3 a3 2009/03/06 2007/05/16 -660.0417 days
4 a4 2008/08/22 2007/05/16 -464.0000 days
5 a5 2009/06/24 2007/05/16 -770.0000 days
6 a6 2008/08/26 2010/05/16 628.0000 days
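The fractional value in row a3 (-660.0417) is most likely a daylight-saving artifact: difftime() converts the strings to date-times in the local time zone, and 0.0417 of a day is one hour. A minimal sketch that avoids this, assuming the columns hold calendar dates only, is to parse them with as.Date() first. The names entry, death, and survival_days are illustrative.

```r
# Same sample dates as above, kept as character strings.
Entry_date <- c("2010/04/13", "2008/07/30", "2009/03/06",
                "2008/08/22", "2009/06/24", "2008/08/26")
Death_date <- c("2007/05/17", "2007/05/16", "2007/05/16",
                "2007/05/16", "2007/05/16", "2010/05/16")

# Parse as calendar dates: no time of day, no local time zone.
entry <- as.Date(Entry_date, format = "%Y/%m/%d")
death <- as.Date(Death_date, format = "%Y/%m/%d")

# With Date inputs, difftime() returns whole days.
survival_days <- difftime(death, entry, units = "days")
print(survival_days)
```

With Date inputs the third difference comes out as a whole number of days, independent of the time zone of the machine running the code.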
You can use lubridate and dplyr. But first, I changed your input data so that the dates are quoted strings, parsed with ymd():
library(lubridate)  # provides ymd()
library(dplyr)      # provides mutate() and %>%
Sample_ID <- c("a1","a2","a3","a4","a5","a6")
Entry_date <- c("2010/04/13", "2008/07/30", "2009/03/06", "2008/08/22", "2009/06/24", "2008/08/26")
Death_date <- c("2007/05/17", "2007/05/16", "2007/05/16", "2007/05/16","2007/05/16", "2010/05/16")
Df <- data.frame(Sample_ID,Entry_date=ymd(Entry_date),Death_date=ymd(Death_date), stringsAsFactors = FALSE)
With this data,
Df %>%
  mutate(survival_days = Death_date - Entry_date)
yields
Sample_ID Entry_date Death_date survival_days
1 a1 2010-04-13 2007-05-17 -1062 days
2 a2 2008-07-30 2007-05-16 -441 days
3 a3 2009-03-06 2007-05-16 -660 days
4 a4 2008-08-22 2007-05-16 -464 days
5 a5 2009-06-24 2007-05-16 -770 days
6 a6 2008-08-26 2010-05-16 628 days
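Subtracting two Date columns gives a difftime column, which prints with a "days" label. Since the question mentions feeding the result into a Cox regression, a plain numeric column may be easier to work with. The sketch below uses only base R and two of the sample rows. The suspect column is a hypothetical addition for illustration, flagging rows where the death date precedes the entry date.

```r
# Two of the sample rows, parsed as calendar dates.
Df <- data.frame(
  Sample_ID  = c("a1", "a6"),
  Entry_date = as.Date(c("2010/04/13", "2008/08/26"), format = "%Y/%m/%d"),
  Death_date = as.Date(c("2007/05/17", "2010/05/16"), format = "%Y/%m/%d")
)

# Date - Date returns a difftime; as.numeric() with units = "days"
# converts it to a plain number of days.
Df$survival_days <- as.numeric(Df$Death_date - Df$Entry_date, units = "days")

# Hypothetical sanity check: a negative duration means the death date
# is earlier than the entry date.
Df$suspect <- Df$survival_days < 0

print(Df)
```

Passing units = "days" explicitly keeps the conversion from depending on whatever unit the difftime happens to carry.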