Complete function in R

I want to complete a df in R, but only for months that are missing entirely. For example, I have one year of data by month and day, like this:

df = data.frame(Date = c("2020-01-01","2020-02-01",
"2020-03-02","2020-04-01","2020-09-01","2020-10-01",
"2020-11-01","2020-12-01"))

When I use the complete function, I call it like this:

df = df %>%
  mutate(Date = as.Date(Date)) %>%
  # seq.Date() needs Date objects, not character strings
  complete(Date = seq.Date(as.Date("2020-01-01"), as.Date("2020-12-31"), by = "month"))

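The problem is that my final df fills in every first-of-month date. For May, June, July and August that is fine, but it also adds a row for March, because March has no row for the first day; its only date is 2020-03-02. The result looks like this: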

 df = data.frame(Date = c("2020-01-01","2020-02-01",
"2020-03-01","2020-03-02","2020-04-01","2020-05-01",
"2020-06-01","2020-07-01","2020-08-01","2020-09-01",
"2020-10-01","2020-11-01","2020-12-01"))

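Do you know how to complete the df only for months that have no date at all?

In my case, I don't want March to be completed, because March already has a date.

Thanks a lot.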

One possible solution is to complete on the yearmon value from the zoo package only, so the actual day of the month does not matter.

library(dplyr) 
library(zoo) # for as.yearmon
library(tidyr) # for complete

df <- data.frame(Date = c("2020-01-01","2020-02-01",
                          "2020-03-02","2020-04-01",
                          "2020-09-01","2020-10-01",
                          "2020-11-01","2020-12-01"), 
                 id = 1:8)
df
#>         Date id
#> 1 2020-01-01  1
#> 2 2020-02-01  2
#> 3 2020-03-02  3
#> 4 2020-04-01  4
#> 5 2020-09-01  5
#> 6 2020-10-01  6
#> 7 2020-11-01  7
#> 8 2020-12-01  8

df %>% 
  mutate(Date = as.Date(Date), 
         year_mon = as.yearmon(Date)) %>%
  complete(
    year_mon = seq.Date(as.Date("2020-01-01"),
                    as.Date("2020-12-31"),
                    by = "month") %>% as.yearmon()
  )
#> # A tibble: 12 x 3
#>    year_mon  Date          id
#>    <yearmon> <date>     <int>
#>  1 Jan 2020  2020-01-01     1
#>  2 Feb 2020  2020-02-01     2
#>  3 Mar 2020  2020-03-02     3
#>  4 Apr 2020  2020-04-01     4
#>  5 May 2020  NA            NA
#>  6 Jun 2020  NA            NA
#>  7 Jul 2020  NA            NA
#>  8 Aug 2020  NA            NA
#>  9 Sep 2020  2020-09-01     5
#> 10 Oct 2020  2020-10-01     6
#> 11 Nov 2020  2020-11-01     7
#> 12 Dec 2020  2020-12-01     8

Created on 2021-06-25 by the reprex package (v2.0.0)
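
The yearmon output leaves Date as NA for the months that were added. If a placeholder date is wanted there, one option is to convert the yearmon value back to a date. This is only a minimal sketch, assuming the first day of the month is an acceptable fill value; the name `filled` is hypothetical and not from the original answer.

```r
library(dplyr)
library(tidyr)
library(zoo)

df <- data.frame(Date = c("2020-01-01", "2020-02-01", "2020-03-02", "2020-04-01",
                          "2020-09-01", "2020-10-01", "2020-11-01", "2020-12-01"))

filled <- df %>%
  mutate(Date = as.Date(Date),
         year_mon = as.yearmon(Date)) %>%
  # complete on the month key, so 2020-03-02 already covers March
  complete(year_mon = as.yearmon(seq(as.Date("2020-01-01"),
                                     as.Date("2020-12-01"),
                                     by = "month"))) %>%
  # as.Date() on a yearmon value returns the first day of that month
  mutate(Date = if_else(is.na(Date), as.Date(year_mon), Date)) %>%
  select(Date)

filled
```

March keeps its original 2020-03-02 row, and only the months with no row at all (May to August) get a first-of-month date.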

You can extract the year and month values from the date and use complete on those:

library(dplyr)
library(lubridate)
library(tidyr)

df %>% 
  mutate(Date = as.Date(Date), 
         year = year(Date), 
         month = month(Date)) %>%
  complete(year, month = 1:12) %>%
  mutate(Date = if_else(is.na(Date), 
                        as.Date(paste(year, month, 1, sep = '-')), Date)) %>%
  select(Date)

#    Date      
#   <date>    
# 1 2020-01-01
# 2 2020-02-01
# 3 2020-03-02
# 4 2020-04-01
# 5 2020-05-01
# 6 2020-06-01
# 7 2020-07-01
# 8 2020-08-01
# 9 2020-09-01
#10 2020-10-01
#11 2020-11-01
#12 2020-12-01
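
For comparison, the same idea can be sketched in base R without extra packages, by comparing "YYYY-MM" keys instead of full dates. This is a swapped-in alternative rather than something taken from the answers above, and the names `have`, `all_months`, `missing_months` and `out` are illustrative only.

```r
df <- data.frame(Date = as.Date(c("2020-01-01", "2020-02-01", "2020-03-02",
                                  "2020-04-01", "2020-09-01", "2020-10-01",
                                  "2020-11-01", "2020-12-01")))

# Month keys ("YYYY-MM") that already have at least one row
have <- format(df$Date, "%Y-%m")

# First day of every month in the target range
all_months <- seq(as.Date("2020-01-01"), as.Date("2020-12-01"), by = "month")

# Keep only the months that have no row at all
missing_months <- all_months[!(format(all_months, "%Y-%m") %in% have)]

# Append the missing months and restore chronological order
out <- data.frame(Date = sort(c(df$Date, missing_months)))
out
```

Because the comparison is made on the month key, any existing date within a month is enough to stop that month from being added again.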