Create combinations within a split group in R

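Using the data frame of location, day, and quantity below, I am looking for a solution that creates combinations of quantities, one per day, within each location. In production these combinations could get very large, so a data.table or plyr approach would be appreciated.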

library(gtools)    
dat <- data.frame(Loc = c(51,51,51,51,51), Day = c("Mon","Mon","Tue","Tue","Wed"),
Qty = c(1,2,3,4,5))

The output for this example should be:

  Loc Day  Qty
1  51 Mon   1
2  51 Tue   3
3  51 Wed   5

4  51 Mon   1
5  51 Tue   4
6  51 Wed   5

7  51 Mon   2
8  51 Tue   3
9  51 Wed   5

10 51 Mon   2
11 51 Tue   4
12 51 Wed   5

I have tried some nested lapply calls, which gets me close, but I am not sure how to take it to the next step and use the combn() function within each store.

lapply(split(dat, dat$Loc), function(x) {
      lapply(split(x, x$Day), function(y) {
          y$Qty
    })                
})

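If each Store > Day group is in its own list, I can get the right combinations, but I am struggling with how to get there from the data frame using a split-apply-combine approach.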

loc51_mon <- c(1,2)
loc51_tue <- c(3,4)
loc51_wed <- c(5)

unlist(lapply(loc51_mon, function(x) {
    lapply(loc51_tue, function(y) {
         lapply(loc51_wed, function(z) {
              combn(c(x,y,z), 3)
         })
    })
}), recursive = FALSE)

[[1]]
[[1]][[1]]
     [,1]
[1,]    1
[2,]    3
[3,]    5

[[2]]
[[2]][[1]]
     [,1]
[1,]    1
[2,]    4
[3,]    5

[[3]]
[[3]][[1]]
     [,1]
[1,]    2
[2,]    3
[3,]    5

[[4]]
[[4]][[1]]
     [,1]
[1,]    2
[2,]    4
[3,]    5

This should work, but any further complexity will require changes to the function:

library(data.table) 
dat <- data.frame(Loc = c(51,51,51,51,51), Day = c("Mon","Mon","Tue","Tue","Wed"),
                  Qty = c(1,2,3,4,5), stringsAsFactors = FALSE)
setDT(dat)

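# Build every one-Qty-per-Day combination for a single Loc, in long format.
comb_in <- function(Qty_In, Day_In){
    # Collapse the Qty values of each day into one "|"-separated string
    temp_df <- aggregate(Qty_In ~ Day_In, cbind(Qty_In, as.character(Day_In)), paste, collapse = "|")
    # Split back into one vector of Qty values per day (values are now character)
    temp_list <- strsplit(temp_df$Qty_In, split = "|", fixed = TRUE)
    names(temp_list) <- as.character(temp_df$Day_In)
    # Cross the per-day vectors, number each combination, and reshape to long format
    melt(as.data.table(expand.grid(temp_list))[, case_group := .I], id.vars = "case_group", variable.name = "Day", value.name = "Qty")
}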

dat[, comb_in(Qty_In = Qty, Day_In = Day), by = Loc][order(Loc,case_group,Day)]
    Loc case_group Day Qty
 1:  51          1 Mon   1
 2:  51          1 Tue   3
 3:  51          1 Wed   5
 4:  51          2 Mon   2
 5:  51          2 Tue   3
 6:  51          2 Wed   5
 7:  51          3 Mon   1
 8:  51          3 Tue   4
 9:  51          3 Wed   5
10:  51          4 Mon   2
11:  51          4 Tue   4
12:  51          4 Wed   5

You can now filter by case_group to get each combination.
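As a minimal sketch of that filtering step, assuming the result of the call above has been stored in a variable (the name `res` is hypothetical, and the table is rebuilt by hand here so the snippet runs on its own), a single combination can be pulled out by `case_group`, or all of them can be split into a list:

```r
library(data.table)

# Hand-built copy of the result table shown above; `res` is a hypothetical name.
res <- data.table(
  Loc        = 51,
  case_group = rep(1:4, each = 3),
  Day        = rep(c("Mon", "Tue", "Wed"), times = 4),
  Qty        = c(1, 3, 5,  2, 3, 5,  1, 4, 5,  2, 4, 5)
)

# Rows belonging to one combination
combo_3 <- res[case_group == 3]

# All combinations at once: a list with one table per Loc / case_group pair
combos <- split(res, by = c("Loc", "case_group"))
```

Splitting on both `Loc` and `case_group` matters once there is more than one location, because `case_group` restarts at 1 within each `Loc`.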

This problem is very similar to another question.

For a general approach (performance may be slower than a problem-specific approach):

permu.sets <- function(listoflist) {
    #assumes that each list within listoflist contains vectors of equal lengths
    temp <- expand.grid(listoflist)   
    do.call(cbind, lapply(temp, function(x) do.call(rbind, x)))
} #permu.sets

#for the problem posted in OP
dat <- data.frame(Loc = c(51,51,51,51,51), Day = c("Mon","Mon","Tue","Tue","Wed"),
    Qty = c(1,2,3,4,5))
vecsets <- lapply(split(dat, dat$Day), function(x) split(as.matrix(x), row(x)))
res <- permu.sets(vecsets)
lapply(split(res, seq(nrow(res))), function(x) matrix(x, ncol=3, byrow=TRUE))
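When only the Qty column needs to be combined, as in the question, the same expand.grid idea can be sketched more compactly in base R. This is an illustration under stated assumptions rather than a tuned solution: the helper name `combos_for_loc` is hypothetical, and the result is in wide format (one column per day) rather than the long format shown in the question.

```r
dat <- data.frame(Loc = c(51, 51, 51, 51, 51),
                  Day = c("Mon", "Mon", "Tue", "Tue", "Wed"),
                  Qty = c(1, 2, 3, 4, 5))

# Hypothetical helper: all one-Qty-per-Day combinations for one location.
combos_for_loc <- function(d) {
  qty_by_day <- split(d$Qty, d$Day)       # one vector of Qty values per day
  grid <- expand.grid(qty_by_day)         # one row per combination, one column per day
  grid$case_group <- seq_len(nrow(grid))  # label each combination
  cbind(Loc = d$Loc[1], grid)
}

# Apply the helper to each location and stack the pieces back together.
res <- do.call(rbind, lapply(split(dat, dat$Loc), combos_for_loc))
print(res)
```

The number of rows per location is the product of the number of Qty values on each day, so the result can grow quickly on production-sized data.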