Combine lists of strings of different lengths into a data frame

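I have some text data in which I need to correct English spelling mistakes.

I want a table as output, with the mistakes in the first column and all suggested corrections in the second.

For example: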

sentence <- "This is a word but thhis isn't and this onne as well. I need hellp"

library(hunspell)
mistakesList <- hunspell(sentence)[[1]]
suggestionsList <- hunspell_suggest(mistakesList)

I tried:

do.call(rbind, Map(data.frame, A=mistakesList, B=suggestionsList))

but it returns one row per suggestion:

            A      B
thhis   thhis   this
onne.1   onne   none
onne.2   onne    one
onne.3   onne  tonne
onne.4   onne  Donne
onne.5   onne   once
onne.6   onne   Anne
onne.7   onne Yvonne
hellp.1 hellp  hello
hellp.2 hellp   hell
hellp.3 hellp   help
hellp.4 hellp hell p

I would like a data frame with one row per mistake instead:

mistakes suggestions
thhis    this
onne     none one tonne Donne once Anne Yvonne
hellp    hello hell help hell p

We can keep mistakesList unchanged and use toString to convert each element of suggestionsList into a single comma-separated string:

data.frame(mistakes = mistakesList, suggestions = sapply(suggestionsList, toString))


#  mistakes                               suggestions
#1    thhis                                      this
#2     onne none, one, tonne, Donne, once, Anne, neon
#3    hellp                 hello, hell, help, hell p
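
For anyone who wants to try this without hunspell installed, here is a minimal sketch on hard-coded stand-ins for the two objects. The values are copied from the question's output rather than computed, so treat them as assumptions (real suggestions depend on the installed dictionary). I have used vapply instead of sapply only so that the result is guaranteed to be a character vector.

```r
# Hard-coded stand-ins for the hunspell() / hunspell_suggest() results.
# Assumed values, copied from the question; not computed here.
mistakesList <- c("thhis", "onne", "hellp")
suggestionsList <- list(
  "this",
  c("none", "one", "tonne", "Donne", "once", "Anne", "Yvonne"),
  c("hello", "hell", "help", "hell p")
)

# Collapse each suggestion vector into one string, so each mistake gets one row.
result <- data.frame(
  mistakes = mistakesList,
  suggestions = vapply(suggestionsList, toString, character(1)),
  stringsAsFactors = FALSE
)
print(result)
```

Note that the suggestions are then plain strings, so if you need the individual words back later you would have to split them again, for example with strsplit(result$suggestions, ", ").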

This works:

X1 <- do.call(rbind, Map(data.frame, mistakes = mistakesList, suggestions = suggestionsList))
X1

library(plyr)

X2 <- ddply(X1, .(mistakes), summarize,
            suggestions = paste(suggestions, collapse = ", "))
X2


  mistakes                                 suggestions
1    thhis                                        this
2     onne none, one, tonne, Donne, once, Anne, Yvonne
3    hellp                   hello, hell, help, hell p
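
If you would rather not load plyr, the same collapsing step can probably be done with base R's aggregate. This is only a sketch under assumptions: the long-format X1 below is a small hand-made stand-in (not the real hunspell output), and as far as I know aggregate orders the rows by the grouping column rather than by first appearance.

```r
# Hand-made long-format stand-in for X1 (assumed values, one row per suggestion).
X1 <- data.frame(
  mistakes = c("thhis", "onne", "onne", "hellp", "hellp"),
  suggestions = c("this", "none", "one", "hello", "help"),
  stringsAsFactors = FALSE
)

# Group by mistake and join that group's suggestions into one string.
X2 <- aggregate(suggestions ~ mistakes, data = X1, FUN = toString)
print(X2)
```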