Why does the date format change to double?

This question is related to an earlier question.

The dataset is:

mydata = data.frame (Id =c(1,1,1,1,1,1,1,1,1,1),
                     Date = c("2000-01-01","2000-01-05","2000-02-02", "2000-02-12", 
                              "2000-02-14","2000-05-13", "2000-05-15", "2000-05-17", 
                              "2000-05-16", "2000-05-20"),
                     drug = c("A","A","B","B","B","A","A","A","C","C"))

   Id       Date drug
1   1 2000-01-01    A
2   1 2000-01-05    A
3   1 2000-02-02    B
4   1 2000-02-12    B
5   1 2000-02-14    B
6   1 2000-05-13    A
7   1 2000-05-15    A
8   1 2000-05-17    A
9   1 2000-05-16    C
10  1 2000-05-20    C

Using this code:

library(lubridate)
library(dplyr)

mydata %>% 
  group_by(Id, drug) %>% 
  mutate(Date = ymd(Date),
         Diff = as.numeric(Date - lag(Date, default = Date[1])),
         startDate = min(Date, na.rm = T),
         endDate = max(Date, na.rm = T),
         startDate =  ifelse(Diff > 100, Date, startDate)
         )

      Id Date       drug   Diff startDate endDate   
   <dbl> <date>     <chr> <dbl>     <dbl> <date>    
 1     1 2000-01-01 A         0     17257 2000-05-17
 2     1 2000-01-05 A         4     17257 2000-05-17
 3     1 2000-02-02 B         0     17257 2000-02-14
 4     1 2000-02-12 B        10     17257 2000-02-14
 5     1 2000-02-14 B         2     17257 2000-02-14
 6     1 2000-05-13 A       129     11090 2000-05-17
 7     1 2000-05-15 A         2     17257 2000-05-17
 8     1 2000-05-17 A         2     17257 2000-05-17
 9     1 2000-05-16 C         0     17257 2000-05-20
10     1 2000-05-20 C         4     17257 2000-05-20

The class of the startDate column changes from date to double in the last line of the code, and I can't figure out why.

I tried origin = "1970-01-01", as.Date, ymd ...

So my question is: why does this happen?

The reason ifelse() changes the class from date to double is documented in help("ifelse"):

The mode of the result may depend on the value of test (see the examples), and the class attribute (see oldClass) of the result is taken from test and may be inappropriate for the values selected from yes and no.
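
A minimal base-R sketch of that behaviour, using made-up vectors (`d`, `test` and `res` are illustrative names, not from the question). Because the result takes its attributes from the logical `test`, the Date class is dropped and only the underlying day count remains:

```r
# Illustrative values only; `d`, `test` and `res` are hypothetical names.
d <- as.Date(c("2000-01-01", "2000-05-13"))
test <- c(TRUE, FALSE)

res <- ifelse(test, d, d + 1)

class(d)     # "Date"
class(res)   # "numeric": attributes come from `test`, so the Date class is lost
typeof(res)  # "double": the values are days since 1970-01-01

# The numbers themselves are still correct and can be converted back.
as.Date(res, origin = "1970-01-01")
```

This also suggests why the numbers in the question's startDate column look like plain day counts rather than dates.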

Perhaps dplyr::if_else() is more appropriate here:

mydata %>% 
  group_by(Id, drug) %>% 
  mutate(Date = lubridate::ymd(Date),
         Diff = as.numeric(Date - lag(Date, default = Date[1])),
         startDate = min(Date, na.rm = T),
         endDate = max(Date, na.rm = T),
         startDate =  if_else(Diff > 100, Date, startDate)
  )

returns

# A tibble: 10 × 6
# Groups:   Id, drug [3]
      Id Date       drug   Diff startDate  endDate   
   <dbl> <date>     <fct> <dbl> <date>     <date>    
 1     1 2000-01-01 A         0 2000-01-01 2000-05-17
 2     1 2000-01-05 A         4 2000-01-01 2000-05-17
 3     1 2000-02-02 B         0 2000-02-02 2000-02-14
 4     1 2000-02-12 B        10 2000-02-02 2000-02-14
 5     1 2000-02-14 B         2 2000-02-02 2000-02-14
 6     1 2000-05-13 A       129 2000-05-13 2000-05-17
 7     1 2000-05-15 A         2 2000-01-01 2000-05-17
 8     1 2000-05-17 A         2 2000-01-01 2000-05-17
 9     1 2000-05-16 C         0 2000-05-16 2000-05-20
10     1 2000-05-20 C         4 2000-05-16 2000-05-20
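
If dplyr is not available, one possible base-R alternative is to skip ifelse() entirely and overwrite the relevant rows by logical index, which keeps the Date class. The sketch below is only an illustration on the drug "A" dates from the question; `gap`, `start` and `long_gap` are names introduced here, not part of the original code:

```r
# Dates for drug "A" from the question's data.
d <- as.Date(c("2000-01-01", "2000-01-05", "2000-05-13",
               "2000-05-15", "2000-05-17"))

# Gap in days to the previous date; 0 for the first row.
gap <- c(0, as.numeric(diff(d)))

# Start from the group minimum, then overwrite rows that follow a long gap.
start <- rep(min(d), length(d))
long_gap <- gap > 100
start[long_gap] <- d[long_gap]

class(start)  # "Date"
start
```

Subset assignment on a Date vector goes through the Date method for `[<-`, so no class information is lost along the way.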