R Add Missing Rows by Condition

I have a data set ('DF1') containing count data. It looks like this:

Species Date Site n
AMCR 6/1/2021 SVC 14
AMCR 6/1/2021 BMA 1
AMCR 6/7/2021 SVA 2
AMCR 6/15/2021 SVA 9
AMCR 6/21/2021 SVA 18
AMCR 6/29/2021 SVA 18

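However, my study actually has nine sites (SVC, BMA, SVA, BMC, TMA, TMC, SRA, SRC, and MCC), and each site was surveyed on the same five dates (6/1/2021, 6/7/2021, 6/15/2021, 6/21/2021, and 6/29/2021). DF1 only contains rows where 'n' has a count. Where there is no count, I would like the data frame to include a row with a zero count for every date at every site, so that it looks like this: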

Species Date Site n
AMCR 6/1/2021 SVC 14
AMCR 6/7/2021 SVC 0
AMCR 6/15/2021 SVC 0
AMCR 6/21/2021 SVC 0
AMCR 6/29/2021 SVC 0
AMCR 6/1/2021 BMA 1
AMCR 6/7/2021 BMA 0
AMCR 6/15/2021 BMA 0
AMCR 6/21/2021 BMA 0
AMCR 6/29/2021 BMA 0
AMCR 6/1/2021 SVA 0
AMCR 6/7/2021 SVA 2
AMCR 6/15/2021 SVA 9
AMCR 6/21/2021 SVA 18
AMCR 6/29/2021 SVA 18
AMCR 6/1/2021 BMC 0
AMCR 6/7/2021 BMC 0
AMCR 6/15/2021 BMC 0
AMCR 6/21/2021 BMC 0
AMCR 6/29/2021 BMC 0
AMCR 6/1/2021 TMA 0
AMCR 6/7/2021 TMA 0
AMCR 6/15/2021 TMA 0
AMCR 6/21/2021 TMA 0
AMCR 6/29/2021 TMA 0
AMCR 6/1/2021 TMC 0
AMCR 6/7/2021 TMC 0
AMCR 6/15/2021 TMC 0
AMCR 6/21/2021 TMC 0
AMCR 6/29/2021 TMC 0
AMCR 6/1/2021 SRA 0
AMCR 6/7/2021 SRA 0
AMCR 6/15/2021 SRA 0
AMCR 6/21/2021 SRA 0
AMCR 6/29/2021 SRA 0
AMCR 6/1/2021 SRC 0
AMCR 6/7/2021 SRC 0
AMCR 6/15/2021 SRC 0
AMCR 6/21/2021 SRC 0
AMCR 6/29/2021 SRC 0
AMCR 6/1/2021 MCC 0
AMCR 6/7/2021 MCC 0
AMCR 6/15/2021 MCC 0
AMCR 6/21/2021 MCC 0
AMCR 6/29/2021 MCC 0

Is it possible to add rows with a count of 0 by checking which Date and Site combinations are absent?

Thanks.

dplyr/tidyr

library(dplyr)
library(tidyr)
dat %>%
  complete(Species, Date, Site, fill = list(n = 0))
# # A tibble: 15 x 4
#    Species Date      Site      n
#    <chr>   <chr>     <chr> <dbl>
#  1 AMCR    6/1/2021  BMA       1
#  2 AMCR    6/1/2021  SVA       0
#  3 AMCR    6/1/2021  SVC      14
#  4 AMCR    6/15/2021 BMA       0
#  5 AMCR    6/15/2021 SVA       9
#  6 AMCR    6/15/2021 SVC       0
#  7 AMCR    6/21/2021 BMA       0
#  8 AMCR    6/21/2021 SVA      18
#  9 AMCR    6/21/2021 SVC       0
# 10 AMCR    6/29/2021 BMA       0
# 11 AMCR    6/29/2021 SVA      18
# 12 AMCR    6/29/2021 SVC       0
# 13 AMCR    6/7/2021  BMA       0
# 14 AMCR    6/7/2021  SVA       2
# 15 AMCR    6/7/2021  SVC       0
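
Note that the output above covers only the three sites that occur in dat, because complete() expands the values it finds in the data. The question lists nine sites, so a minimal sketch that supplies them explicitly might look like the following. The names all_sites and res are illustrative and not part of the original answer; the site codes are copied from the question, and the sketch assumes all five survey dates already occur somewhere in dat. Converting Date to a real date is also an addition here, made so that rows sort chronologically rather than as text.

```r
library(dplyr)
library(tidyr)

dat <- data.frame(
  Species = "AMCR",
  Date = c("6/1/2021", "6/1/2021", "6/7/2021", "6/15/2021", "6/21/2021", "6/29/2021"),
  Site = c("SVC", "BMA", "SVA", "SVA", "SVA", "SVA"),
  n = c(14L, 1L, 2L, 9L, 18L, 18L)
)

# Assumption: the full list of sites, taken from the question text
all_sites <- c("SVC", "BMA", "SVA", "BMC", "TMA", "TMC", "SRA", "SRC", "MCC")

res <- dat %>%
  # Parse the month/day/year strings so dates order chronologically
  mutate(Date = as.Date(Date, format = "%m/%d/%Y")) %>%
  # Site = all_sites forces every site to appear, not just those in dat
  complete(Species, Date, Site = all_sites, fill = list(n = 0)) %>%
  # Order rows by the site order given in the question, then by date
  arrange(match(Site, all_sites), Date)

nrow(res)
```

With one species, five dates, and nine sites this should give 45 rows. If some survey dates might be missing from the data entirely, the same idea applies to Date: pass an explicit vector of dates instead of relying on the values present.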

Base R

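# Build every Species x Date x Site combination seen in dat,
# then outer-join it to dat; combinations without a count get NA
dat2 <- merge(
  dat,
  do.call(expand.grid, lapply(dat[, 1:3], unique)),
  by = names(dat)[1:3],
  all = TRUE
)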
dat2
#    Species      Date Site  n
# 1     AMCR  6/1/2021  BMA  1
# 2     AMCR  6/1/2021  SVA NA
# 3     AMCR  6/1/2021  SVC 14
# 4     AMCR 6/15/2021  BMA NA
# 5     AMCR 6/15/2021  SVA  9
# 6     AMCR 6/15/2021  SVC NA
# 7     AMCR 6/21/2021  BMA NA
# 8     AMCR 6/21/2021  SVA 18
# 9     AMCR 6/21/2021  SVC NA
# 10    AMCR 6/29/2021  BMA NA
# 11    AMCR 6/29/2021  SVA 18
# 12    AMCR 6/29/2021  SVC NA
# 13    AMCR  6/7/2021  BMA NA
# 14    AMCR  6/7/2021  SVA  2
# 15    AMCR  6/7/2021  SVC NA
# Replace the NA counts introduced by the join with 0
dat2$n <- ifelse(is.na(dat2$n), 0, dat2$n)
dat2
#    Species      Date Site  n
# 1     AMCR  6/1/2021  BMA  1
# 2     AMCR  6/1/2021  SVA  0
# 3     AMCR  6/1/2021  SVC 14
# 4     AMCR 6/15/2021  BMA  0
# 5     AMCR 6/15/2021  SVA  9
# 6     AMCR 6/15/2021  SVC  0
# 7     AMCR 6/21/2021  BMA  0
# 8     AMCR 6/21/2021  SVA 18
# 9     AMCR 6/21/2021  SVC  0
# 10    AMCR 6/29/2021  BMA  0
# 11    AMCR 6/29/2021  SVA 18
# 12    AMCR 6/29/2021  SVC  0
# 13    AMCR  6/7/2021  BMA  0
# 14    AMCR  6/7/2021  SVA  2
# 15    AMCR  6/7/2021  SVC  0
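
As with the tidyr version, the base R result above contains only the sites present in dat. A rough sketch that includes all nine sites from the question could pass an explicit site vector to expand.grid(). The names all_sites, grid, and full are hypothetical, and the sketch assumes every survey date occurs at least once in dat.

```r
dat <- data.frame(
  Species = "AMCR",
  Date = c("6/1/2021", "6/1/2021", "6/7/2021", "6/15/2021", "6/21/2021", "6/29/2021"),
  Site = c("SVC", "BMA", "SVA", "SVA", "SVA", "SVA"),
  n = c(14L, 1L, 2L, 9L, 18L, 18L),
  stringsAsFactors = FALSE
)

# Assumption: the full list of sites, taken from the question text
all_sites <- c("SVC", "BMA", "SVA", "BMC", "TMA", "TMC", "SRA", "SRC", "MCC")

# Every Species x Date x Site combination, including sites absent from dat
grid <- expand.grid(
  Species = unique(dat$Species),
  Date = unique(dat$Date),
  Site = all_sites,
  stringsAsFactors = FALSE
)

# Left join: keep every row of grid, pulling in n where a count exists
full <- merge(grid, dat, by = c("Species", "Date", "Site"), all.x = TRUE)

# Combinations with no count come back as NA; set them to 0
full$n[is.na(full$n)] <- 0L

nrow(full)
```

Using all.x = TRUE rather than all = TRUE keeps the result limited to the combinations in grid, which matters only if dat could contain a site code that is not in all_sites.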

Data

dat <- structure(list(Species = c("AMCR", "AMCR", "AMCR", "AMCR", "AMCR", "AMCR"), Date = c("6/1/2021", "6/1/2021", "6/7/2021", "6/15/2021", "6/21/2021", "6/29/2021"), Site = c("SVC", "BMA", "SVA", "SVA", "SVA", "SVA"), n = c(14L, 1L, 2L, 9L, 18L, 18L)), class = "data.frame", row.names = c(NA, -6L))