foreach and doParallel run without error in R but do not give the correct result
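I am trying to build a foreach loop to replace misspelled words in a larger data frame. My code runs without error, but I am not seeing the correct result. Below is a sample of my data frame and the code I used.
I have a main data frame, plus a lookup data frame used to find and replace predefined misspelled text in the main data frame: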
#create main data frame
df <- data.frame("Index" = 1:7, "Text" = c("Brad came to dinner with us tonigh.",
"Wuld you like to trave with me?",
"There is so muh to undestand.",
"Sentences cone in many shaes and sizes.",
"Learnin R is fun",
"yesterday was Friday",
"bing search engine"), stringsAsFactors = FALSE)
#create predefined misspelled data frame
df_r <- data.frame("misspelled" = c("tonigh", "Wuld", "trave", "muh", "undestand", "shaes", "Learnin"),
"correction" = c("tonight", "Would", "travel", "much", "understand", "shapes", "Learning"))
library(DataCombine)
library(doParallel)
library(foreach)
no_cores <- detectCores()
cl <- makeCluster(no_cores[1]-1)
registerDoParallel(cl)
df_replacement <- foreach((df$Text), .combine = cbind) %dopar% {
replacement = DataCombine::FindReplace(data = df, Var = "Text", replaceData = df_r,
from = "misspelled", to = "correction", exact = FALSE)
replacement
}
stopCluster(cl)
I am not sure what I am doing wrong in the foreach part. Any suggestions are appreciated.
I think you are looking for this:
df_replacement <- foreach(i = (rownames(df)), .combine = rbind) %dopar% {
replacement = DataCombine::FindReplace(data = df[i,], Var = "Text", replaceData = df_r,
from = "misspelled", to = "correction", exact = FALSE)
replacement
}
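What happened:
foreach knows it has to run once per row. But your function always received the whole data frame, so every iteration also returned the whole data frame, with its two columns. .combine = cbind then combined those data frames column-wise: 2 (columns) * 7 (iterations) = 14 columns. So make sure FindReplace is given only the row you want in each iteration, not the whole data frame.
I edited the call so that FindReplace receives df[i,], the row for the current iteration. I also changed cbind to rbind, because you want to stack rows afterwards, not add columns.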
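To see the shape difference in isolation, here is a minimal sketch. It assumes only the foreach package is installed; it uses the sequential %do% operator and a three-row stand-in data frame instead of a cluster and FindReplace, so it only illustrates how .combine shapes the result and is not the full fix.

```r
library(foreach)

# Small stand-in for the question's data frame (illustrative data only).
df <- data.frame(Index = 1:3,
                 Text = c("tonigh", "Wuld you", "so muh"),
                 stringsAsFactors = FALSE)

# Original pattern: every iteration returns the whole data frame,
# and cbind places the copies side by side.
wide <- foreach(i = seq_len(nrow(df)), .combine = cbind) %do% {
  df
}
dim(wide)  # 3 rows, 2 * 3 = 6 columns

# Corrected pattern: every iteration returns a single row,
# and rbind stacks the rows back into the original shape.
tall <- foreach(i = seq_len(nrow(df)), .combine = rbind) %do% {
  df[i, ]
}
dim(tall)  # 3 rows, 2 columns
```

With the seven-row data frame from the question the same arithmetic gives 14 columns for the cbind version and a 7 x 2 result for the rbind version.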