How to divide combinations of rows using dplyr or another method in R?

Question

site <- rep(1:4, each = 8, len = 32)
rep <- rep(1:8, times = 4, len = 32)
treatment <- rep(c("A.low","A.low","A.high","A.high","A.mix","A.mix","B.mix","B.mix"), 4)
sp.1 <- sample(0:3,size=32,replace=TRUE)
sp.2 <- sample(0:2,size=32,replace=TRUE)
df.dummy <- data.frame(site, rep, treatment, sp.1, sp.2)

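This produces the data frame df.dummy, with one row per site and rep and the columns site, rep, treatment, sp.1 and sp.2.

For each site, I want to compute ratios between treatment groups. Two examples: "A.low/A.high" = "sp.1/sp.1"; "A.low/A.mix" = "sp.1/sp.2". You will notice that there are two replicates of each treatment per site, and I would like every pairing of them to appear in the final column. My final product would look something like: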

site  rep   treatment      value
  1.  1/3.  A.low/A.high.   Inf
  1.  1/4.  A.low/A.high.   1

I started with dplyr, but I am really not sure how to proceed, especially for all the combinations:

  df.dummy %>% 
  group_by(site) %>% 
  summarise(value.1 = sp.1[treatment = "A.low"] / sp.1[treatment = "A.high"])

Answer 1

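You can use reshape2 to get the data into a format that is easier to work with.

The code below separates the sp.1 and sp.2 data. acast is used so that each reshaped table has one row per site and one column per unique sample (replicate and treatment), holding the sp.1 or sp.2 values.

The columns are then given unique names and the two tables are combined with cbind.

Each column can then be compared against any other, as you require.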

library(dplyr)
library(reshape2)

##your setup
site <- rep(1:4, each = 8, len = 32)
rep <- rep(1:8, times = 4, len = 32)
treatment <- rep(c("A.low","A.low","A.high","A.high","A.mix","A.mix","B.mix","B.mix"), 4)
sp.1 <- sample(0:3,size=32,replace=TRUE)
sp.2 <- sample(0:2,size=32,replace=TRUE)
df.dummy <- data.frame(site, rep, treatment, sp.1, sp.2)

##create unique ids and create a dataframe containing 1 value column
sp1 <- df.dummy %>% mutate(id = paste(rep, treatment, sep = "_")) %>% select(id, site, rep, treatment, sp.1)
sp2 <- df.dummy %>% mutate(id = paste(rep, treatment, sep = "_")) %>% select(id, site, rep, treatment, sp.2)

##reshape the data so that each treatment and replicate is assigned a single column
##each row will be a single site
##each column will contain the values from sp.1 or sp.2
##value.var is given explicitly so acast does not have to guess the value column
sp1 <- reshape2::acast(data = sp1, formula = site ~ id, value.var = "sp.1")
sp2 <- reshape2::acast(data = sp2, formula = site ~ id, value.var = "sp.2")

##rename columns something sensible and unique
colnames(sp1) <- c("low.1.sp1", "low.2.sp1", "high.3.sp1", "high.4.sp1",
                   "mix.5.sp1", "mix.6.sp1", "mix.7.sp1", "mix.8.sp1")
colnames(sp2) <- c("low.1.sp2", "low.2.sp2", "high.3.sp2", "high.4.sp2",
                   "mix.5.sp2", "mix.6.sp2", "mix.7.sp2", "mix.8.sp2")

##combine datasets; acast returns matrices, so convert to a data frame for mutate
dat <- as.data.frame(cbind(sp1, sp2))

##choose which columns to compare. Some examples shown below
dat <-  dat %>% mutate(low.1.sp1/high.3.sp1, low.1.sp1/high.4.sp1,
                       low.2.sp1/high.3.sp2)
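
If you would rather not type out every column pair by hand, one possible alternative is to pair the rows within each site using base R's merge and divide afterwards. This is only a sketch under some assumptions: ratio_pairs is a hypothetical helper written for this example (it is not part of dplyr or reshape2), the seed is arbitrary, and the output layout is my reading of the table in the question. A zero denominator gives Inf, or NaN for 0/0.

```r
# Toy data with the same layout as df.dummy (fixed seed so the run is repeatable)
set.seed(42)
df.dummy <- data.frame(
  site      = rep(1:4, each = 8),
  rep       = rep(1:8, times = 4),
  treatment = rep(c("A.low", "A.low", "A.high", "A.high",
                    "A.mix", "A.mix", "B.mix", "B.mix"), 4),
  sp.1      = sample(0:3, size = 32, replace = TRUE),
  sp.2      = sample(0:2, size = 32, replace = TRUE)
)

# Hypothetical helper: divide `num_col` of every `num_trt` row by `den_col`
# of every `den_trt` row belonging to the same site
ratio_pairs <- function(df, num_trt, num_col, den_trt, den_col) {
  num <- df[df$treatment == num_trt, c("site", "rep", "treatment", num_col)]
  den <- df[df$treatment == den_trt, c("site", "rep", "treatment", den_col)]
  names(num) <- c("site", "rep.num", "trt.num", "num")
  names(den) <- c("site", "rep.den", "trt.den", "den")
  # Merging on site alone pairs every numerator row with every denominator row
  both <- merge(num, den, by = "site")
  data.frame(
    site      = both$site,
    rep       = paste(both$rep.num, both$rep.den, sep = "/"),
    treatment = paste(both$trt.num, both$trt.den, sep = "/"),
    value     = both$num / both$den
  )
}

low_high <- ratio_pairs(df.dummy, "A.low", "sp.1", "A.high", "sp.1")
low_mix  <- ratio_pairs(df.dummy, "A.low", "sp.1", "A.mix",  "sp.2")
result   <- rbind(low_high, low_mix)
head(result)
```

Each call returns four rows per site (2 numerator replicates x 2 denominator replicates), so further comparisons can be added by calling the helper again and appending the result with rbind.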
