Why would an R package load random numbers?

Recently, I was reading the documentation for the caret package and noticed this:

Also, please note that some packages load random numbers when loaded (directly or via namespace) and this may effect [sic] reproducibility.

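What are some possible use cases for a package drawing random numbers when it is loaded? This seems to run counter to the idea of reproducible research and could interfere with my own attempts to use set.seed. (I have started setting seeds closer to the code that requires random number generation, because I am worried about side effects of loading packages.)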
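The defensive habit described above can be sketched as follows. This is only an illustration in base R: the stray runif() calls stand in for whatever a package might draw while loading, and the variable names are made up for the example.

```r
# Something earlier in the session (for example a package's load hook)
# may have consumed random numbers; this stray draw stands in for that.
invisible(runif(1))

# Seeding immediately before the code that needs randomness makes the
# result independent of whatever happened earlier.
set.seed(123)
first_run <- rnorm(3)

# More unrelated draws, again standing in for package side effects.
invisible(runif(5))

set.seed(123)
second_run <- rnorm(3)

# Both runs start from the same RNG state, so they should match.
identical(first_run, second_run)
```

The trade-off is that the seed has to be repeated at every place that needs it, rather than being set once at the top of a script.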

One example of a package that does this is ggplot2, as mentioned by Hadley Wickham in a response to a GitHub issue related to tidyverse.

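When the package is attached, a tip is randomly selected and displayed to the user (with some probability that no tip is displayed at all). If we examine its .onAttach() function as it existed before January 2018, we see that it calls both runif() and sample(), which changes the state of the random number generator: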

.onAttach <- function(...) {
  if (!interactive() || stats::runif(1) > 0.1) return()

  tips <- c(
    "Need help? Try the ggplot2 mailing list: http://groups.google.com/group/ggplot2.",
    "Find out what's changed in ggplot2 at http://github.com/tidyverse/ggplot2/releases.",
    "Use suppressPackageStartupMessages() to eliminate package startup messages.",
    "Stackoverflow is a great place to get help: http://stackoverflow.com/tags/ggplot2.",
    "Need help getting started? Try the cookbook for R: http://www.cookbook-r.com/Graphs/",
    "Want to understand how all the pieces fit together? Buy the ggplot2 book: http://ggplot2.org/book/"
  )

  tip <- sample(tips, 1)
  packageStartupMessage(paste(strwrap(tip), collapse = "\n"))
}

release_questions <- function() {
  c(
    "Have you built the book?"
  )
}
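To make the side effect concrete, here is a small sketch that does not need ggplot2 at all. fake_on_attach() is a hypothetical stand-in that mimics the runif()/sample() pattern of the hook above; it is not the package's code.

```r
# Hypothetical stand-in for an attach hook that picks a random tip.
fake_on_attach <- function() {
  # Most of the time no tip is chosen, but runif() has still been called.
  if (stats::runif(1) > 0.1) return(invisible(NULL))
  invisible(sample(c("tip A", "tip B"), 1))
}

# Baseline: seed, then draw.
set.seed(42)
without_hook <- rnorm(3)

# Same seed, but the hook runs between set.seed() and the draw.
set.seed(42)
fake_on_attach()  # consumes at least one random number
with_hook <- rnorm(3)

# The hook advanced the RNG stream, so the two results should differ.
identical(without_hook, with_hook)
```

Under these assumptions the comparison should come out FALSE, which is the reproducibility problem the caret documentation warns about.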

However, this has since been fixed with a commit authored by Jim Hester, so that the seed is restored after ggplot2 is attached:

.onAttach <- function(...) {
  withr::with_preserve_seed({
    if (!interactive() || stats::runif(1) > 0.1) return()

    tips <- c(
      "Need help? Try the ggplot2 mailing list: http://groups.google.com/group/ggplot2.",
      "Find out what's changed in ggplot2 at http://github.com/tidyverse/ggplot2/releases.",
      "Use suppressPackageStartupMessages() to eliminate package startup messages.",
      "Stackoverflow is a great place to get help: http://stackoverflow.com/tags/ggplot2.",
      "Need help getting started? Try the cookbook for R: http://www.cookbook-r.com/Graphs/",
      "Want to understand how all the pieces fit together? Buy the ggplot2 book: http://ggplot2.org/book/"
      )

    tip <- sample(tips, 1)
    packageStartupMessage(paste(strwrap(tip), collapse = "\n"))
  })
}
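For package authors who would rather not depend on withr, the same idea can be approximated in base R. The helper below, preserve_seed(), is a hypothetical name and only a minimal sketch of the technique (save .Random.seed, run the code, put it back); it is not how withr itself is implemented.

```r
# Hypothetical helper: evaluate `code`, then restore the RNG state.
preserve_seed <- function(code) {
  has_seed <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (has_seed) {
    # Remember the current state and put it back when the function exits.
    old_seed <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
    on.exit(assign(".Random.seed", old_seed, envir = globalenv()), add = TRUE)
  } else {
    # No state existed before, so remove whatever `code` created.
    on.exit(
      if (exists(".Random.seed", envir = globalenv(), inherits = FALSE)) {
        rm(list = ".Random.seed", envir = globalenv())
      },
      add = TRUE
    )
  }
  force(code)  # `code` is lazy, so the draws happen here, inside the guard
}

set.seed(42)
expected <- rnorm(3)

set.seed(42)
preserve_seed(sample(10, 1))  # draws inside do not advance the outer stream
actual <- rnorm(3)

identical(expected, actual)
```

With the state restored on exit, the draw inside preserve_seed() leaves the caller's random number stream where it was, so the two results should be identical.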

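So there can be various reasons why a package would do this, though package authors have ways to keep it from causing unexpected consequences for the user.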