How to make R foreach threads write to the same log file

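I have a number of lengthy jobs that I would like to parallelize with foreach-dopar so that each thread works independently of the others. I want to track the status of each thread (some may fail while others succeed) by writing to a log file using sink. The following obviously does not work; the log file ends up with only one entry.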

library(foreach)
library(doParallel)
library(doSNOW)

cl = makeCluster(2, type="SOCK")
registerDoSNOW(cl)
dl = file("runlog.Rout", open="wt")
sink(dl, type="output",  append=TRUE)
sink(dl, type="message", append=TRUE)
dump <- foreach(i=1:5, 
            .errorhandling = "stop",
            .verbose=TRUE) %dopar% 
{
    beg.time = Sys.time()
    cat(as.character(beg.time), " I am running....\n", file="mylog.txt")
    # do something here.....
    end.time = Sys.time()
    del.tm = difftime(end.time, beg.time, units="mins")  
    cat("....saving output to file......\n\n", file="mylog.txt")
    save(del.tm, file = paste("I:/Rhome/H", i, ".RData", sep=""))
    return(i)
}
stopCluster(cl)
sink(type="output")
sink(type="message")

The log file contains only one line:

....saving output to file......

What went wrong?

Although I am wary of having multiple processes write to the same file, you may get it to work with the append=TRUE option:

cat("...\n", file="mylog.txt", append=TRUE)

Without this option, each call to cat overwrites the previous contents of "mylog.txt", which is why only the last message survives.
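As a minimal sketch of that fix applied to a whole loop, the example below has every task append two lines to one shared file. It assumes all workers run on the local machine, uses doParallel rather than doSNOW, and writes to a hypothetical path under tempdir(); none of these choices come from the question. Concurrent appends are not guaranteed to stay un-interleaved, so treat this as adequate for short status lines only.

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)

# Hypothetical log location; an absolute path avoids surprises if a
# worker's working directory differs from the master's.
log_file <- file.path(tempdir(), "mylog.txt")
if (file.exists(log_file)) file.remove(log_file)

res <- foreach(i = 1:5) %dopar% {
  # append = TRUE adds to the file instead of truncating it
  cat(sprintf("%s task %d started (pid %d)\n",
              format(Sys.time()), i, Sys.getpid()),
      file = log_file, append = TRUE)

  # do something here.....

  cat(sprintf("%s task %d finished\n", format(Sys.time()), i),
      file = log_file, append = TRUE)
  i
}

stopCluster(cl)

# Inspect the combined log from the master process
log_lines <- readLines(log_file)
print(log_lines)
```

Each task writes one "started" and one "finished" line, so the order of lines in the file reflects scheduling across workers rather than the order of i.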

For other approaches, see my answer here.

You can also call makeCluster with the outfile argument. From the documentation, outfile specifies:

Where to direct the stdout and stderr connection output from the workers. "" indicates no redirection (which may only be useful for workers on the local machine). Defaults to ‘/dev/null’ (‘nul:’ on Windows). The other possibility is a file path on the worker's host. Files will be opened in append mode, as all workers log to the same file.
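A rough sketch of the outfile approach follows, assuming local workers created with parallel's makeCluster (loaded through doParallel) and a hypothetical log path under tempdir(). With outfile set, plain cat() or print() calls inside the loop body need no file argument, because the workers' stdout and stderr are redirected for them. The file typically also contains worker start-up messages, and output may be buffered until the workers exit.

```r
library(foreach)
library(doParallel)

# Hypothetical log location shared by all workers
worker_log <- file.path(tempdir(), "workers.log")

# outfile redirects each worker's stdout and stderr to this file,
# opened in append mode
cl <- makeCluster(2, outfile = worker_log)
registerDoParallel(cl)

res <- foreach(i = 1:5) %dopar% {
  # No file argument needed: output goes to the worker's redirected stdout
  cat(sprintf("task %d running on pid %d\n", i, Sys.getpid()))
  i
}

stopCluster(cl)

# The log can then be read from the master, for example with
# readLines(worker_log)
```

Passing outfile = "" instead leaves worker output unredirected, which the documentation quoted above notes may only be useful for workers on the local machine.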