Subtract date objects?

I'm trying to simply subtract child_date from survey_date, but I keep getting the "character string is not in a standard unambiguous format" error. Both columns are character class, so what is the problem?

This doesn't work:

df %>% mutate(child_age = survey_date-child_date)

structure(list(case_id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), person_id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), household_id = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6), year = c(2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018), month = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), survey_date_cmc = c(1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417, 1417), mom_age = c(28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 37, 37, 37, 37, 37, 37, 37, 37, 37, 37), mom_dob_cmc = c(1081, 1081, 1081, 1081, 1081, 1081, 1081, 1081, 1081, 1081, 973, 973, 973, 973, 973, 973, 973, 973, 973, 973), name = c("b3_01", "b3_02", "b3_03", "b3_04", "b3_05", "b3_06", "b3_07", "b3_08", "b3_09", "b3_10", "b3_01", "b3_02", "b3_03", "b3_04", "b3_05", "b3_06", "b3_07", "b3_08", "b3_09", "b3_10"), value = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1297, 1297, NA, NA, NA, NA, NA, NA, NA, NA), child_date = c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "2008-01-01", "2008-01-01", NA, NA, NA, NA, NA, NA, NA, NA), survey_date = c("2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01", "2018-01-01")), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L), groups = structure(list( mom_age = c(28, 37), case_id = 1:2, .rows = list(1:10, 11:20)), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE))

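The columns are of character class, and subtraction is not defined for character vectors. They need to be converted to Date class first: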

library(dplyr)
df %>% 
   mutate(child_age = as.Date(survey_date) - as.Date(child_date))
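As a quick check of what the conversion does, here is a minimal sketch on two hand-made values that mirror the columns in the question (the vectors below are illustrative, not the full data). It shows that the NA entries simply propagate and that the result is a difftime object:

```r
# Toy values mirroring child_date and survey_date (assumed for illustration)
child_date  <- c(NA, "2008-01-01")
survey_date <- c("2018-01-01", "2018-01-01")

class(child_date)                     # "character"

# Convert both sides to Date before subtracting
age <- as.Date(survey_date) - as.Date(child_date)

class(age)                            # "difftime"
age_days <- as.numeric(age, units = "days")
age_days                              # NA 3653
```

The NA in the first position is expected: a missing child_date gives a missing age rather than an error.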

For more control over the units, use difftime:

df %>%
   mutate(child_age = difftime(as.Date(survey_date), as.Date(child_date), units = 'weeks'))

Or use interval from lubridate:

library(lubridate)
df %>% 
     mutate(child_age = interval( as.Date(child_date), as.Date(survey_date))/years(1))
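To compare the units the different approaches return, here is a small sketch on a single pair of dates taken from the sample data. The variable names are made up for illustration, and it assumes lubridate is installed:

```r
library(lubridate)

# One pair of dates from the sample data
child  <- as.Date("2008-01-01")
survey <- as.Date("2018-01-01")

# Plain subtraction: difftime in days
age_days <- as.numeric(survey - child, units = "days")

# difftime with explicit units
age_weeks <- as.numeric(difftime(survey, child, units = "weeks"))

# lubridate interval divided by a one-year period: age in calendar years
age_years <- interval(child, survey) / years(1)

age_days    # 3653
age_weeks   # 3653 / 7, roughly 521.86
age_years   # 10
```

Dividing the interval by years(1) counts calendar years, so leap days do not distort the result the way dividing a day count by 365 would.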