R: transposing and splitting a row with a delimiter.

I have a table:

rawData <- as.data.frame(matrix(c(1,2,3,4,5,6,"a,b,c","d,e","f"),nrow=3,ncol=3))

 1  4 a,b,c
 2  5   d,e
 3  6     f

I would like to convert it to:

1  2  3
4  5  6
a  d  f
b  e
c

So far I can transpose the table and split the third column, but I don't know how to rebuild a new table in the format outlined above.

new = t(rawData)

for (e in seq_len(ncol(new))) {
  s <- strsplit(new[3, e], split = ",")
  print(s)
}

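I tried creating a new vector on each iteration, but I'm not sure how to efficiently put each vector back into a data frame. Any help would be appreciated. Thanks!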

You can use stri_list2matrix from the stringi package:

library(stringi)       
rawData <- as.data.frame(matrix(c(1,2,3,4,5,6,"a,b,c","d,e","f"),nrow=3,ncol=3),stringsAsFactors = FALSE)

d1 <- t(rawData[,1:2])
rownames(d1) <- NULL

d2 <- stri_list2matrix(strsplit(rawData$V3,split=','))

rbind(d1,d2)  
#    [,1] [,2] [,3]
# [1,] "1"  "2"  "3" 
# [2,] "4"  "5"  "6" 
# [3,] "a"  "d"  "f" 
# [4,] "b"  "e"  NA  
# [5,] "c"  NA   NA  
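
If adding a dependency on stringi is not an option, the same padding step can probably be done in base R. The following is only a sketch under that assumption, not part of the answer above: the names parts, n, padded, top and result are made up for illustration, and the data frame is rebuilt by hand so the snippet runs on its own. Each split vector is extended to the longest length with `length<-`, which fills the extra slots with NA, and the padded vectors are then bound as columns.

```r
# Same data as in the question, with the columns kept as character strings.
rawData <- data.frame(V1 = c("1", "2", "3"),
                      V2 = c("4", "5", "6"),
                      V3 = c("a,b,c", "d,e", "f"),
                      stringsAsFactors = FALSE)

# Split the third column on commas; this gives a list of character vectors.
parts <- strsplit(rawData$V3, split = ",", fixed = TRUE)

# Pad every vector with NA up to the length of the longest one.
n <- max(lengths(parts))
padded <- do.call(cbind, lapply(parts, function(p) {
  length(p) <- n  # extending a vector fills the new slots with NA
  p
}))

# Transpose the first two columns and stack the padded pieces underneath.
top <- unname(t(as.matrix(rawData[, 1:2])))
result <- rbind(top, padded)
print(result)
```

Using do.call(cbind, ...) rather than sapply keeps the result a matrix even when every row splits into a single piece.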

You can also use cSplit from my "splitstackshape" package.

By default, it just creates additional columns after splitting the input:

library(splitstackshape)
cSplit(rawData, "V3")
#    V1 V2 V3_1 V3_2 V3_3
# 1:  1  4    a    b    c
# 2:  2  5    d    e   NA
# 3:  3  6    f   NA   NA

You can transpose that to get the desired output.

t(cSplit(rawData, "V3"))
#      [,1] [,2] [,3]
# V1   "1"  "2"  "3" 
# V2   "4"  "5"  "6" 
# V3_1 "a"  "d"  "f" 
# V3_2 "b"  "e"  NA  
# V3_3 "c"  NA   NA
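
The desired output in the question shows empty cells where these results have NA. One way to get that for display, sketched here as an assumption about what the asker wants, is to replace the NA entries with empty strings before printing. The matrix m below is typed in by hand to mirror the result above, and blanked is a made-up name.

```r
# Character matrix mirroring the transposed result, typed in by hand.
m <- rbind(c("1", "2", "3"),
           c("4", "5", "6"),
           c("a", "d", "f"),
           c("b", "e", NA),
           c("c", NA, NA))

# Replace missing values with empty strings so they print as blanks.
blanked <- m
blanked[is.na(blanked)] <- ""

# quote = FALSE drops the quotation marks around each cell.
print(blanked, quote = FALSE)
```

This only changes how the table looks; keeping the NA values is usually more convenient if the matrix is processed further.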