More dynamic melting with data.table

I am looking for the most efficient way to transform

  ARTNR           FILGRP
1     1             9827
2     2             9348
3     3 9335, 9827, 9339

into this:

  ARTNR      FILGRP
1     1      9827
2     2      9348
3     3      9335
4     3      9827
5     3      9339

I tried the code below. It works, but it is not elegant and has some drawbacks:

setDT(artnrs)  
artnrs[, c("P1", "P2", "P3") := tstrsplit(FILGRP, ",", fixed=TRUE)] # 1)
artnrs <- melt(artnrs, c("ARTNR"), measure = patterns("^P")) # 2)
artnrs[,variable:=NULL] # 3)
artnrs <- na.omit(artnrs, cols="value") # 4)
names(artnrs)[2] <- "FILGRP" # 5)

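It is based on data.table, but performance is not that critical, so an easy-to-understand tidyverse solution would be fine too. The fewer packages, the better, though.

Thanks!

dput output: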

structure(list(ARTNR = c(1, 2, 3), FILGRP = c("9827", "9348", "9335, 9827, 9339")), 
row.names = c(NA, -3L), class = "data.frame")
df <- structure(list(ARTNR = c(1, 2, 3), FILGRP = c("9827", "9348", "9335, 9827, 9339")), 
          row.names = c(NA, -3L), class = "data.frame")

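# Split on a comma plus any following whitespace so the values carry no leading spaces
parts <- strsplit(df$FILGRP, split = ",\\s*")
# Repeat each ARTNR once per extracted value
df2 <- data.frame(ARTNR = rep(df$ARTNR, lengths(parts)), FILGRP = unlist(parts))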
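
The question says an easy-to-understand tidyverse solution would also be acceptable. A minimal sketch, assuming the tidyr package is installed and that the values are always separated by a comma with optional whitespace, could use `separate_rows()`. The name `long_df` is made up for illustration.

```r
library(tidyr)

# Sample data from the question's dput output
df <- structure(list(ARTNR = c(1, 2, 3),
                     FILGRP = c("9827", "9348", "9335, 9827, 9339")),
                row.names = c(NA, -3L), class = "data.frame")

# Split FILGRP on a comma plus optional whitespace and give each value its own row;
# ARTNR is repeated as needed (long_df is a hypothetical name)
long_df <- separate_rows(df, FILGRP, sep = ",\\s*")
long_df
```

FILGRP stays a character column here; wrap it in `as.integer()` afterwards if numeric values are needed.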

Here is a data.table approach:

library( data.table )
# DT is assumed to hold the sample data from the question's dput output
setDT(DT)

melt( DT[, paste0( "v", 1:length(tstrsplit( DT$FILGRP, ", ") ) ) := tstrsplit( FILGRP, ", ") ],
      id.vars = "ARTNR", 
      measure.vars = patterns( "^v" ),
      value.name = "FILGRP" )[!is.na(FILGRP), .SD, .SDcols = c(1,3) ]


#    ARTNR FILGRP
# 1:     1   9827
# 2:     2   9348
# 3:     3   9335
# 4:     3   9827
# 5:     3   9339
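
The melt-based approach above first creates one temporary column per split value. As an alternative sketch (not the method used in the answer above), the strings can be split and unlisted within each ARTNR group, which avoids the wide intermediate step and does not depend on the number of values per row. The name `long` is made up for illustration, and the separator is assumed to be a comma with optional whitespace.

```r
library(data.table)

# Sample data from the question's dput output
DT <- data.table(
  ARTNR  = c(1, 2, 3),
  FILGRP = c("9827", "9348", "9335, 9827, 9339")
)

# Split each FILGRP string on a comma plus optional whitespace,
# then unlist within each ARTNR group so every value gets its own row
long <- DT[, .(FILGRP = unlist(strsplit(FILGRP, ",\\s*"))), by = ARTNR]
long
```

Because no helper columns are added, there is nothing to drop or rename afterwards, and rows without a value never produce NA entries that need filtering.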