R Add Missing Rows by Condition
I have a dataset containing count data ('DF1'). It looks like this:
Species | Date | Site | n |
---|---|---|---|
AMCR | 6/1/2021 | SVC | 14 |
AMCR | 6/1/2021 | BMA | 1 |
AMCR | 6/7/2021 | SVA | 2 |
AMCR | 6/15/2021 | SVA | 9 |
AMCR | 6/21/2021 | SVA | 18 |
AMCR | 6/29/2021 | SVA | 18 |
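However, my study actually has nine 'Sites' (SVC, BMA, SVA, BMC, TMA, TMC, SRA, SRC, and MCC), and each site was surveyed on the same five dates (6/1/2021, 6/7/2021, 6/15/2021, 6/21/2021, and 6/29/2021). DF1 only shows rows that have a count in 'n', but where there is no count I would like the data frame to hold a zero count for every date at every site, so that it looks like this: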
Species | Date | Site | n |
---|---|---|---|
AMCR | 6/1/2021 | SVC | 14 |
AMCR | 6/7/2021 | SVC | 0 |
AMCR | 6/15/2021 | SVC | 0 |
AMCR | 6/21/2021 | SVC | 0 |
AMCR | 6/29/2021 | SVC | 0 |
AMCR | 6/1/2021 | BMA | 1 |
AMCR | 6/7/2021 | BMA | 0 |
AMCR | 6/15/2021 | BMA | 0 |
AMCR | 6/21/2021 | BMA | 0 |
AMCR | 6/29/2021 | BMA | 0 |
AMCR | 6/1/2021 | SVA | 0 |
AMCR | 6/7/2021 | SVA | 2 |
AMCR | 6/15/2021 | SVA | 9 |
AMCR | 6/21/2021 | SVA | 18 |
AMCR | 6/29/2021 | SVA | 18 |
AMCR | 6/1/2021 | BMC | 0 |
AMCR | 6/7/2021 | BMC | 0 |
AMCR | 6/15/2021 | BMC | 0 |
AMCR | 6/21/2021 | BMC | 0 |
AMCR | 6/29/2021 | BMC | 0 |
AMCR | 6/1/2021 | TMA | 0 |
AMCR | 6/7/2021 | TMA | 0 |
AMCR | 6/15/2021 | TMA | 0 |
AMCR | 6/21/2021 | TMA | 0 |
AMCR | 6/29/2021 | TMA | 0 |
AMCR | 6/1/2021 | TMC | 0 |
AMCR | 6/7/2021 | TMC | 0 |
AMCR | 6/15/2021 | TMC | 0 |
AMCR | 6/21/2021 | TMC | 0 |
AMCR | 6/29/2021 | TMC | 0 |
AMCR | 6/1/2021 | SRA | 0 |
AMCR | 6/7/2021 | SRA | 0 |
AMCR | 6/15/2021 | SRA | 0 |
AMCR | 6/21/2021 | SRA | 0 |
AMCR | 6/29/2021 | SRA | 0 |
AMCR | 6/1/2021 | SRC | 0 |
AMCR | 6/7/2021 | SRC | 0 |
AMCR | 6/15/2021 | SRC | 0 |
AMCR | 6/21/2021 | SRC | 0 |
AMCR | 6/29/2021 | SRC | 0 |
AMCR | 6/1/2021 | MCC | 0 |
AMCR | 6/7/2021 | MCC | 0 |
AMCR | 6/15/2021 | MCC | 0 |
AMCR | 6/21/2021 | MCC | 0 |
AMCR | 6/29/2021 | MCC | 0 |
Is it possible to add rows with a count of 0 by checking which of these date and site combinations are missing?
Thanks.
dplyr/tidyr
library(dplyr)
library(tidyr)
dat %>%
complete(Species, Date, Site, fill = list(n = 0))
# # A tibble: 15 x 4
# Species Date Site n
# <chr> <chr> <chr> <dbl>
# 1 AMCR 6/1/2021 BMA 1
# 2 AMCR 6/1/2021 SVA 0
# 3 AMCR 6/1/2021 SVC 14
# 4 AMCR 6/15/2021 BMA 0
# 5 AMCR 6/15/2021 SVA 9
# 6 AMCR 6/15/2021 SVC 0
# 7 AMCR 6/21/2021 BMA 0
# 8 AMCR 6/21/2021 SVA 18
# 9 AMCR 6/21/2021 SVC 0
# 10 AMCR 6/29/2021 BMA 0
# 11 AMCR 6/29/2021 SVA 18
# 12 AMCR 6/29/2021 SVC 0
# 13 AMCR 6/7/2021 BMA 0
# 14 AMCR 6/7/2021 SVA 2
# 15 AMCR 6/7/2021 SVC 0
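The output above has 15 rows rather than the 45 the question asks for, because `complete()` can only expand the values it finds in `dat`, and the sample data contains just three of the nine sites. One possible way to cover the sites that never appear is to pass the full site list explicitly. This is a minimal sketch: the vector name `all_sites` is an assumption introduced here for illustration, with its values taken from the question, and `0L` is used as the fill value so that `n` stays an integer column.

```r
library(tidyr)

# Sample data from the question (same values as the `dat` object in the answer)
dat <- data.frame(
  Species = "AMCR",
  Date = c("6/1/2021", "6/1/2021", "6/7/2021", "6/15/2021", "6/21/2021", "6/29/2021"),
  Site = c("SVC", "BMA", "SVA", "SVA", "SVA", "SVA"),
  n = c(14L, 1L, 2L, 9L, 18L, 18L)
)

# Hypothetical helper vector: the nine sites listed in the question
all_sites <- c("SVC", "BMA", "SVA", "BMC", "TMA", "TMC", "SRA", "SRC", "MCC")

# Expand to every Species x Date x Site combination, including sites
# that have no rows in `dat`, and fill the missing counts with zero
full <- complete(dat, Species, Date, Site = all_sites, fill = list(n = 0L))

nrow(full)  # 1 species x 5 dates x 9 sites = 45
```

The same idea would apply to dates: if a survey date produced no counts at any site, it would also have to be supplied explicitly, for example as `Date = all_dates`.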
Base R
dat2 <- merge(dat, do.call(expand.grid, lapply(dat[,1:3], unique)), by = names(dat)[1:3], all = TRUE)
dat2
# Species Date Site n
# 1 AMCR 6/1/2021 BMA 1
# 2 AMCR 6/1/2021 SVA NA
# 3 AMCR 6/1/2021 SVC 14
# 4 AMCR 6/15/2021 BMA NA
# 5 AMCR 6/15/2021 SVA 9
# 6 AMCR 6/15/2021 SVC NA
# 7 AMCR 6/21/2021 BMA NA
# 8 AMCR 6/21/2021 SVA 18
# 9 AMCR 6/21/2021 SVC NA
# 10 AMCR 6/29/2021 BMA NA
# 11 AMCR 6/29/2021 SVA 18
# 12 AMCR 6/29/2021 SVC NA
# 13 AMCR 6/7/2021 BMA NA
# 14 AMCR 6/7/2021 SVA 2
# 15 AMCR 6/7/2021 SVC NA
dat2$n <- ifelse(is.na(dat2$n), 0, dat2$n)
dat2
# Species Date Site n
# 1 AMCR 6/1/2021 BMA 1
# 2 AMCR 6/1/2021 SVA 0
# 3 AMCR 6/1/2021 SVC 14
# 4 AMCR 6/15/2021 BMA 0
# 5 AMCR 6/15/2021 SVA 9
# 6 AMCR 6/15/2021 SVC 0
# 7 AMCR 6/21/2021 BMA 0
# 8 AMCR 6/21/2021 SVA 18
# 9 AMCR 6/21/2021 SVC 0
# 10 AMCR 6/29/2021 BMA 0
# 11 AMCR 6/29/2021 SVA 18
# 12 AMCR 6/29/2021 SVC 0
# 13 AMCR 6/7/2021 BMA 0
# 14 AMCR 6/7/2021 SVA 2
# 15 AMCR 6/7/2021 SVC 0
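As with the tidyr version, the base R result only contains the three sites present in `dat`. A sketch of one way to extend it to all nine sites is to build the grid from an explicit site vector instead of `unique()`. The name `all_sites` is an assumption introduced for illustration; its values come from the question.

```r
# Sample data from the question (same values as the `dat` object in the answer)
dat <- data.frame(
  Species = "AMCR",
  Date = c("6/1/2021", "6/1/2021", "6/7/2021", "6/15/2021", "6/21/2021", "6/29/2021"),
  Site = c("SVC", "BMA", "SVA", "SVA", "SVA", "SVA"),
  n = c(14L, 1L, 2L, 9L, 18L, 18L)
)

# Hypothetical helper vector: the nine sites listed in the question
all_sites <- c("SVC", "BMA", "SVA", "BMC", "TMA", "TMC", "SRA", "SRC", "MCC")

# Every Species x Date x Site combination; keep the columns as character
grid <- expand.grid(
  Species = unique(dat$Species),
  Date = unique(dat$Date),
  Site = all_sites,
  stringsAsFactors = FALSE
)

# Left join the observed counts onto the full grid, then zero-fill the gaps
dat2 <- merge(grid, dat, by = c("Species", "Date", "Site"), all.x = TRUE)
dat2$n[is.na(dat2$n)] <- 0L

nrow(dat2)  # 45
```

Indexing with `is.na()` is used here in place of `ifelse()`; both give the same zero-filled counts, but the indexed assignment keeps `n` as an integer column.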
Data
dat <- structure(list(Species = c("AMCR", "AMCR", "AMCR", "AMCR", "AMCR", "AMCR"), Date = c("6/1/2021", "6/1/2021", "6/7/2021", "6/15/2021", "6/21/2021", "6/29/2021"), Site = c("SVC", "BMA", "SVA", "SVA", "SVA", "SVA"), n = c(14L, 1L, 2L, 9L, 18L, 18L)), class = "data.frame", row.names = c(NA, -6L))
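Because `Date` is stored as a character column in this data, the outputs above sort it alphabetically, which is why 6/7/2021 appears after 6/29/2021. If chronological order matters, one option (a sketch, assuming the dates are all in month/day/year form as shown) is to convert the column with `as.Date()` before completing or merging:

```r
# Sample data from the question (same values as the `dat` object above)
dat <- data.frame(
  Species = "AMCR",
  Date = c("6/1/2021", "6/1/2021", "6/7/2021", "6/15/2021", "6/21/2021", "6/29/2021"),
  Site = c("SVC", "BMA", "SVA", "SVA", "SVA", "SVA"),
  n = c(14L, 1L, 2L, 9L, 18L, 18L)
)

# Parse the month/day/year strings into Date objects
dat$Date <- as.Date(dat$Date, format = "%m/%d/%Y")

# Dates now sort chronologically rather than alphabetically
dat <- dat[order(dat$Site, dat$Date), ]
sorted_dates <- sort(unique(dat$Date))
```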