Looping with tidyr

I have some data from Wikipedia:

RHCP_data
                V1              V2              V3           V4
1       bar:kiedis from:01/01/1983 till:01/11/1986 color:vocals
2       bar:kiedis from:01/12/1986        till:end color:vocals
3         bar:flea from:01/01/1983        till:end   color:bass
4        bar:smith from:03/12/1988        till:end  color:drums
5  bar:klinghoffer from:01/10/2009        till:end   color:lead
6       bar:slovak from:01/01/1983 till:01/12/1983   color:lead
7       bar:slovak from:01/02/1985 till:25/06/1988   color:lead
...
...

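I am trying to strip the variable names (the bar:, from:, till: and color: prefixes) using tidyr. For a single column this works: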

separate(RHCP_data, "V1", into = c("a", "b"), sep = ":")[2]

             b
1       kiedis
2       kiedis
3         flea
4        smith
5  klinghoffer
6       slovak
7       slovak
...
...

I would like to understand why this does not work:

for(i in 1:4){
  RHCP_data[,i] <- separate(RHCP_data, paste0("V", i), into = c("a", "b"), sep = ":")[2][,1]
}

I get this error:

Error: Invalid column specification

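The dataset is obviously small, so it is not a problem in this case, but I feel there is something about tidyr or loops that I do not understand. Any help is appreciated.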

We can simply use cSplit, without any loop.

library(splitstackshape)
DT <- cSplit(RHCP_data, 1:ncol(RHCP_data), ':')
DT[, seq(2, ncol(DT), by=2), with=FALSE]
#          V1_2       V2_2       V3_2   V4_2
#1:      kiedis 01/01/1983 01/11/1986 vocals
#2:      kiedis 01/12/1986        end vocals
#3:        flea 01/01/1983        end   bass
#4:       smith 03/12/1988        end  drums
#5: klinghoffer 01/10/2009        end   lead
#6:      slovak 01/01/1983 01/12/1983   lead
#7:      slovak 01/02/1985 25/06/1988   lead

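To pass the column name as a variable, you need to use separate_ instead of separate. In the tidyr version used here, separate captures its column argument unevaluated, so a call such as paste0("V", i) is never turned into a column name, which is what triggers "Invalid column specification".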
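
As a rough analogy in base R (not tidyr itself), the difference between taking an argument literally and evaluating it first can be sketched with `$` versus `[[`. The two-row data frame and the variable `nm` below are made up for illustration.

```r
# Hypothetical two-row stand-in for the question's data
df <- data.frame(V1 = c("bar:kiedis", "bar:flea"), stringsAsFactors = FALSE)

# Build the column name programmatically, as the loop in the question does
nm <- paste0("V", 1)

# `$` uses the symbol literally: it looks for a column called "nm"
literal <- df$nm

# `[[` evaluates nm first, so it finds the column "V1"
evaluated <- df[[nm]]

print(is.null(literal))
print(evaluated)
```

The literal lookup finds nothing because no column is named "nm", while the evaluated lookup returns the V1 column.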

If you want to loop over the columns, I suggest:

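library(tidyr)
df <- RHCP_data  # the data frame from the question
# For each column, split on ":" and keep only the part after it
lst = lapply(seq(ncol(df)), function(x) {
    separate_(df, paste0('V', x), into = paste0(c("a", "b"), x), sep = ":")[x:(x+1)][,2]
})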

data.frame(setNames(lst, names(df)))
#           V1         V2         V3     V4
#1      kiedis 01/01/1983 01/11/1986 vocals
#2      kiedis 01/12/1986        end vocals
#3        flea 01/01/1983        end   bass
#4       smith 03/12/1988        end  drums
#5 klinghoffer 01/10/2009        end   lead
#6      slovak 01/01/1983 01/12/1983   lead
#7      slovak 01/02/1985 25/06/1988   lead
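
If the only goal is to drop the prefix before the colon, a plain for loop in base R may be enough. The following is a minimal sketch that swaps tidyr for `sub` with a regular expression. It rebuilds the first three rows of the question's data by hand, and the name `cleaned` is just illustrative.

```r
# Hand-built stand-in for RHCP_data (first three rows from the question)
RHCP_data <- data.frame(
  V1 = c("bar:kiedis", "bar:kiedis", "bar:flea"),
  V2 = c("from:01/01/1983", "from:01/12/1986", "from:01/01/1983"),
  V3 = c("till:01/11/1986", "till:end", "till:end"),
  V4 = c("color:vocals", "color:vocals", "color:bass"),
  stringsAsFactors = FALSE
)

# Work on a copy so the original data stays untouched
cleaned <- RHCP_data

# In every column, remove everything up to and including the first ":"
for (i in seq_along(cleaned)) {
  cleaned[[i]] <- sub("^[^:]*:", "", cleaned[[i]])
}

print(cleaned)
```

Indexing with `[[i]]` evaluates `i` on each pass, so the loop does not run into the column specification problem that `separate` has.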