How do I add a column of abundance data by summing observations per site?

Question

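I have a data frame containing observations of scallop presence/absence across multiple sites. I would like to use the UID (unique identifier) and the presence/absence column (binary: 0 for absent, 1 for present) to calculate the number of scallops at each site.

My data frame looks like this: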

UID	Present.Absent	Size.cm	binary
A-10-2021	Present	4.60	1
A-10-2021	Present	6.0	1
A-11-2021	Present	4.70	1
A-11-2021	Present	4.8	1
A-4-2021	Absent	NA	0
A-5-2021	Present	5.90	1
A-5-2021	Present	6.00	1
A-5-2021	Present	6.00	1
A-5-2021	Present	3.90	1
A-5-2021	Present	5.00	1
A-6-2021	Absent	NA	0

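It goes on for about 6,000 observations, with roughly 1,500 different UIDs.

I am new to R and don't know how to go about this. Is there a way to get one row per UID, with a column of abundance data? Any help is greatly appreciated, and I am happy to provide any other information that would be useful. Thanks!

Edit: added a sample of the data (first 10 rows)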

structure(list(UID = c("A-10-2021", "A-10-2021", "A-11-2021", 
"A-11-2021", "A-1-2021", "A-1-2021", "A-1-2021", "A-12-2021", 
"A-12-2021", "A-12-2021"), Present.Absent = c("Present", "Present", 
"Present", "Present", "Present", "Present", "Present", "Present", 
"Present", "Present"), Alive.Dead = c("Alive", "Alive", "Alive", 
"Alive", "Alive", "Alive", "Alive", "Alive", "Alive", "Alive"
), Size.cm = c(4.6, 5.25, 4.7, 5.1, 3.5, 3.9, 4.7, 4.7, 4.9, 
4.9), binary = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1)), row.names = c(3L, 
4L, 9L, 10L, 14L, 15L, 17L, 36L, 37L, 38L), class = "data.frame")

Answer 1

You can do this with group_by():

# Your data
temp1 <- structure(list(UID = c("A-10-2021", "A-10-2021", "A-11-2021", 
"A-11-2021", "A-1-2021", "A-1-2021", "A-1-2021", "A-12-2021", 
"A-12-2021", "A-12-2021"), Present.Absent = c("Present", "Present", 
"Present", "Present", "Present", "Present", "Present", "Present", 
"Present", "Present"), Alive.Dead = c("Alive", "Alive", "Alive", 
"Alive", "Alive", "Alive", "Alive", "Alive", "Alive", "Alive"
), Size.cm = c(4.6, 5.25, 4.7, 5.1, 3.5, 3.9, 4.7, 4.7, 4.9, 
4.9), id = c(3L, 4L, 9L, 10L, 14L, 15L, 17L, 36L, 37L, 38L)), row.names = c(3L, 
4L, 9L, 10L, 14L, 15L, 17L, 36L, 37L, 38L), class = "data.frame")

Note that you can first create the binary column (isPresent) using mutate() and ifelse().
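As a minimal sketch of what that indicator column contains, the snippet below applies ifelse() to a small vector. The `status` vector and the names `is_present` and `is_present_int` are hypothetical and only stand in for the Present.Absent column; they are not part of the original data.

```r
# Hypothetical toy vector standing in for the Present.Absent column
status <- c("Present", "Absent", "Present")

# ifelse() returns 1 where the condition is TRUE and 0 otherwise
is_present <- ifelse(status == "Present", 1, 0)

# Equivalent shortcut: coerce the logical comparison to integer
is_present_int <- as.integer(status == "Present")
```

Both forms return NA for a missing status, so if the column may contain NA values, sum(..., na.rm = TRUE) is likely what you want when totalling.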

library(tidyverse)

# Option 1: Create a new column with abundance, by UID, but keep the number of rows
temp1 %>%
  mutate(isPresent = ifelse(Present.Absent == "Present", 1, 0)) %>%
  group_by(UID) %>%
  mutate(abundance = sum(isPresent))

# Option 2: Get a summary, with one row per UID
temp1 %>%
  mutate(isPresent = ifelse(Present.Absent == "Present", 1, 0)) %>%
  group_by(UID) %>%
  summarise(abundance = sum(isPresent))
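The 10-row sample contains only "Present" rows, so it does not show what happens at sites with no scallops. The sketch below rebuilds the rows from the table in the question (Size.cm omitted) and applies the same summarise() idea. The object names `scallops`, `abundance_by_site` and `abundance_base` are assumptions made for illustration, and the aggregate() call is offered as one possible base R alternative, not as part of the original answer.

```r
library(dplyr)

# Rows copied from the table in the question (Size.cm omitted for brevity)
scallops <- data.frame(
  UID = c("A-10-2021", "A-10-2021", "A-11-2021", "A-11-2021", "A-4-2021",
          "A-5-2021", "A-5-2021", "A-5-2021", "A-5-2021", "A-5-2021",
          "A-6-2021"),
  Present.Absent = c("Present", "Present", "Present", "Present", "Absent",
                     "Present", "Present", "Present", "Present", "Present",
                     "Absent")
)

# One row per UID; summing the logical comparison counts the "Present" rows
abundance_by_site <- scallops %>%
  group_by(UID) %>%
  summarise(abundance = sum(Present.Absent == "Present"), .groups = "drop")

# Base R alternative (an assumption, not from the original answer):
# build the 0/1 indicator with transform(), then sum it per UID
abundance_base <- aggregate(
  abundance ~ UID,
  data = transform(scallops, abundance = as.integer(Present.Absent == "Present")),
  FUN = sum
)
```

Sites whose only row is "Absent" (A-4-2021 and A-6-2021 here) still appear in the result, with an abundance of 0, because the Absent row contributes 0 to the sum rather than being dropped.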

Tags: r, sum, count