How do I add a column of abundance data by summing observations per site?

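I have a data frame containing presence/absence observations of scallops across multiple sites. I want to use the UID (unique identifier) column and the presence/absence column (binary: 0 = absent, 1 = present) to count the number of scallops at each site.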

My data frame looks like this:

UID        Present.Absent  Size.cm  binary
A-10-2021  Present         4.60     1
A-10-2021  Present         6.0      1
A-11-2021  Present         4.70     1
A-11-2021  Present         4.8      1
A-4-2021   Absent          NA       0
A-5-2021   Present         5.90     1
A-5-2021   Present         6.00     1
A-5-2021   Present         6.00     1
A-5-2021   Present         3.90     1
A-5-2021   Present         5.00     1
A-6-2021   Absent          NA       0

It continues like this for about 6000 observations, with roughly 1500 distinct UIDs.

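I'm new to R and don't know how to approach this. Is there a way to get one row per UID, with a column of abundance data? Any help is much appreciated, and I'm happy to provide more information if it would be useful. Thanks!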

Edit: added a sample of the data (first 10 rows):

structure(list(UID = c("A-10-2021", "A-10-2021", "A-11-2021", 
"A-11-2021", "A-1-2021", "A-1-2021", "A-1-2021", "A-12-2021", 
"A-12-2021", "A-12-2021"), Present.Absent = c("Present", "Present", 
"Present", "Present", "Present", "Present", "Present", "Present", 
"Present", "Present"), Alive.Dead = c("Alive", "Alive", "Alive", 
"Alive", "Alive", "Alive", "Alive", "Alive", "Alive", "Alive"
), Size.cm = c(4.6, 5.25, 4.7, 5.1, 3.5, 3.9, 4.7, 4.7, 4.9, 
4.9), binary = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), row.names = c(3L, 
4L, 9L, 10L, 14L, 15L, 17L, 36L, 37L, 38L), class = "data.frame")

You can do this with group_by():

# Your data
temp1 <- structure(list(UID = c("A-10-2021", "A-10-2021", "A-11-2021", 
"A-11-2021", "A-1-2021", "A-1-2021", "A-1-2021", "A-12-2021", 
"A-12-2021", "A-12-2021"), Present.Absent = c("Present", "Present", 
"Present", "Present", "Present", "Present", "Present", "Present", 
"Present", "Present"), Alive.Dead = c("Alive", "Alive", "Alive", 
"Alive", "Alive", "Alive", "Alive", "Alive", "Alive", "Alive"
), Size.cm = c(4.6, 5.25, 4.7, 5.1, 3.5, 3.9, 4.7, 4.7, 4.9, 
4.9), id = c(3L, 4L, 9L, 10L, 14L, 15L, 17L, 36L, 37L, 38L)), row.names = c(3L, 
4L, 9L, 10L, 14L, 15L, 17L, 36L, 37L, 38L), class = "data.frame")

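Note that you can first create the binary column (called isPresent here) with mutate() and ifelse(). Summing that 0/1 column within each UID then gives the abundance, and a site with only "Absent" rows gets 0.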

library(tidyverse)

# Option 1: Create a new column with abundance, by UID, but keep the number of rows
temp1 %>%
  mutate(isPresent = ifelse(Present.Absent == "Present", 1, 0)) %>%
  group_by(UID) %>%
  mutate(abundance = sum(isPresent))

# Option 2: Get a summary, with one row per UID
temp1 %>%
  mutate(isPresent = ifelse(Present.Absent == "Present", 1, 0)) %>%
  group_by(UID) %>%
  summarise(abundance = sum(isPresent))
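
If you would rather not load the tidyverse, the same two results can probably be had in base R. The sketch below is only an illustration: it assumes the binary column from the question is already present, copies the UID and binary values from the table printed in the question (so that the "Absent" sites are included), and uses the hypothetical names scallops and abundance.

```r
# Rows copied from the sample table in the question (UID and binary only).
# `scallops` is a made-up name for illustration.
scallops <- data.frame(
  UID = c("A-10-2021", "A-10-2021", "A-11-2021", "A-11-2021", "A-4-2021",
          "A-5-2021", "A-5-2021", "A-5-2021", "A-5-2021", "A-5-2021",
          "A-6-2021"),
  binary = c(1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0)
)

# Equivalent of Option 2: one row per UID, summing the 0/1 column per site
abundance <- aggregate(binary ~ UID, data = scallops, FUN = sum)
names(abundance)[2] <- "abundance"
abundance

# Equivalent of Option 1: keep every row and attach the per-UID total
scallops$abundance <- ave(scallops$binary, scallops$UID, FUN = sum)
scallops
```

Because the sum is taken over the 0/1 column, sites such as A-4-2021 that only have an "Absent" row should still appear in the result, with an abundance of 0.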