How to divide combinations of rows using dplyr or another method in R?
site <- rep(1:4, each = 8, len = 32)
rep <- rep(1:8, times = 4, len = 32)
treatment <- rep(c("A.low","A.low","A.high","A.high","A.mix","A.mix","B.mix","B.mix"), 4)
sp.1 <- sample(0:3,size=32,replace=TRUE)
sp.2 <- sample(0:2,size=32,replace=TRUE)
df.dummy <- data.frame(site, rep, treatment, sp.1, sp.2)
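The code above builds the final data frame, df.dummy.
For each site, I want to divide the groups by one another. Two examples: "A.low/A.high" = "sp.1/sp.1"; "A.low/A.mix" = "sp.1/sp.2". Note that each site has two replicates of every treatment, and I would like all of the permutations to appear in the final column. My final product would look something like: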
site  rep  treatment     value
1     1/3  A.low/A.high  Inf
1     1/4  A.low/A.high  1
I started with dplyr, but I am really not sure how to proceed, especially with all of the combinations:
df.dummy %>%
  group_by(site) %>%
  # `==` compares values; this still pairs replicates element-wise only, not in all combinations
  summarise(value.1 = sp.1[treatment == "A.low"] / sp.1[treatment == "A.high"])
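You can use reshape2 to get the data into a format that is easier to work with.
The code below separates the sp.1 and sp.2 data. acast is then used so that each result has one row per site and each column is a unique sample holding the values from sp.1 or sp.2.
The columns are given unique names and the two results are combined with cbind.
Each column can then be compared with any other, as required.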
library(dplyr)
library(reshape2)
##your setup
site <- rep(1:4, each = 8, len = 32)
rep <- rep(1:8, times = 4, len = 32)
treatment <- rep(c("A.low","A.low","A.high","A.high","A.mix","A.mix","B.mix","B.mix"), 4)
sp.1 <- sample(0:3,size=32,replace=TRUE)
sp.2 <- sample(0:2,size=32,replace=TRUE)
df.dummy <- data.frame(site, rep, treatment, sp.1, sp.2)
##create unique ids and create a dataframe containing 1 value column
sp1 <- df.dummy %>% mutate(id = paste(rep, treatment, sep = "_")) %>% select(id, site, rep, treatment, sp.1)
sp2 <- df.dummy %>% mutate(id = paste(rep, treatment, sep = "_")) %>% select(id, site, rep, treatment, sp.2)
##reshape the data so that each treatment and replicate is assigned a single column
##each row will be a single site
##each column will contain the values from sp.1 or sp.2
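##name the value column explicitly so acast does not have to guess it
sp1 <- reshape2::acast(data = sp1, formula = site ~ id, value.var = "sp.1")
sp2 <- reshape2::acast(data = sp2, formula = site ~ id, value.var = "sp.2")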
##rename columns something sensible and unique
colnames(sp1) <- c("low.1.sp1", "low.2.sp1", "high.3.sp1", "high.4.sp1",
"mix.5.sp1", "mix.6.sp1", "mix.7.sp1", "mix.8.sp1")
colnames(sp2) <- c("low.1.sp2", "low.2.sp2", "high.3.sp2", "high.4.sp2",
"mix.5.sp2", "mix.6.sp2", "mix.7.sp2", "mix.8.sp2")
##combine datasets; acast returns matrices, so convert to a data frame before using mutate
dat <- as.data.frame(cbind(sp1, sp2))
##choose which columns to compare. Some examples shown below
dat <- dat %>% mutate(low.1.sp1/high.3.sp1, low.1.sp1/high.4.sp1,
low.2.sp1/high.3.sp2)
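If the long format from the question is wanted (one row per pair of replicates), one possible sketch in base R is to merge each site's numerator rows with its denominator rows, which yields every within-site combination. The helper name ratio_pairs and its arguments are hypothetical, introduced only for illustration, and set.seed(1) is an assumption added so the example is reproducible.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible

# Same layout as the question's df.dummy
df.dummy <- data.frame(
  site      = rep(1:4, each = 8),
  rep       = rep(1:8, times = 4),
  treatment = rep(c("A.low", "A.low", "A.high", "A.high",
                    "A.mix", "A.mix", "B.mix", "B.mix"), 4),
  sp.1      = sample(0:3, size = 32, replace = TRUE),
  sp.2      = sample(0:2, size = 32, replace = TRUE)
)

# Hypothetical helper: pair every `num` row with every `den` row of the same
# site and divide the chosen species columns.
ratio_pairs <- function(data, num, den, num_col, den_col) {
  top    <- data[data$treatment == num, c("site", "rep", num_col)]
  bottom <- data[data$treatment == den, c("site", "rep", den_col)]
  names(top)    <- c("site", "rep.num", "num")
  names(bottom) <- c("site", "rep.den", "den")
  # merging on site alone gives all numerator x denominator pairs per site
  pairs <- merge(top, bottom, by = "site")
  data.frame(
    site      = pairs$site,
    rep       = paste(pairs$rep.num, pairs$rep.den, sep = "/"),
    treatment = paste(num, den, sep = "/"),
    value     = pairs$num / pairs$den
  )
}

# The two comparisons named in the question
result <- rbind(
  ratio_pairs(df.dummy, "A.low", "A.high", "sp.1", "sp.1"),
  ratio_pairs(df.dummy, "A.low", "A.mix",  "sp.1", "sp.2")
)
head(result)
```

Each comparison contributes 2 x 2 = 4 rows per site. Because the counts can be zero, value may be Inf (a positive count divided by 0) or NaN (0 divided by 0), so filter or recode those rows if they are not meaningful for the analysis.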