How does the use.cache feature of Packrat work?

Packrat has a use.cache option that can reduce package installation time.

The documentation provides the following information:

use.cache: Install packages into a global cache, which is then shared across projects? The directory to use is read through Sys.getenv("R_PACKRAT_CACHE_DIR"). Not yet implemented for Windows. (logical; defaults to FALSE)

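However, running install.packages() does not pick up packages that are already installed in the user library.

How does use.cache work?

Installing with the global cache enabled

Set up the cache for packrat with the following commands: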

#Optional to set location of cache:
#Sys.setenv(R_PACKRAT_CACHE_DIR = "/home/willbowditch/R/packratcache")

packrat::set_opts(use.cache=TRUE)

This writes to packrat.opts, which determines whether the cache is used when the project is opened in RStudio:

auto.snapshot: TRUE
use.cache: TRUE
print.banner.on.startup: auto
vcs.ignore.lib: TRUE
vcs.ignore.src: FALSE
external.packages:
local.repos:
load.external.packages.on.startup: TRUE
ignored.packages:
quiet.package.installation: TRUE
snapshot.recommended.packages: FALSE
snapshot.fields:
    Imports
    Depends
    LinkingTo
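
To confirm which value a project has recorded, the options file can be read directly. The following is a minimal sketch, assuming the file sits at packrat/packrat.opts inside the project and uses the "key: value" layout shown above. read_use_cache is a hypothetical helper name, not part of packrat.

```r
# Hypothetical helper: read the use.cache flag from a packrat options file.
# ASSUMPTION: the file uses one "key: value" pair per line, as in the listing.
read_use_cache <- function(opts_file = "packrat/packrat.opts") {
  lines <- readLines(opts_file)
  hit <- grep("^use\\.cache:", lines, value = TRUE)
  if (length(hit) == 0) return(NA)  # option not recorded in this file
  as.logical(trimws(sub("^use\\.cache:", "", hit[1])))
}
```

Inside a packrat session, packrat::get_opts("use.cache") should report the same setting without touching the file.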

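Installed packages are stored in the cache and symlinked into the project library, while base packages are symlinked from the system R library: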

./packrat/lib/x86_64-pc-linux-gnu/3.4.0:
total 2
drwxr-xr-x 2 willbowditch staff  4 Jun 14 16:21 .
drwxr-xr-x 3 willbowditch staff  3 Jun 14 16:20 ..
lrwxrwxrwx 1 willbowditch staff 99 Jun 14 16:21 CheckDigit -> /home/willbowditch/R/packratcache/v2/library/CheckDigit/0ab3083cafb11382646fdda41ddb8b98/CheckDigit
lrwxrwxrwx 1 willbowditch staff 93 Jun 14 16:21 packrat -> /home/willbowditch/R/packratcache/v2/library/packrat/6ad605ba7b4b476d84be6632393f5765/packrat

./packrat/lib-ext:
total 9
drwxr-xr-x 2 willbowditch staff 2 Jun 14 16:20 .
drwxr-xr-x 6 willbowditch staff 9 Jun 14 16:20 ..

./packrat/lib-R:
total 24
drwxr-xr-x 2 willbowditch staff 16 Jun 14 16:20 .
drwxr-xr-x 6 willbowditch staff  9 Jun 14 16:20 ..
lrwxrwxrwx 1 willbowditch staff 29 Jun 14 16:20 base -> /usr/local/lib/R/library/base
lrwxrwxrwx 1 willbowditch staff 33 Jun 14 16:20 compiler -> /usr/local/lib/R/library/compiler
lrwxrwxrwx 1 willbowditch staff 33 Jun 14 16:20 datasets -> /usr/local/lib/R/library/datasets
lrwxrwxrwx 1 willbowditch staff 33 Jun 14 16:20 graphics -> /usr/local/lib/R/library/graphics
lrwxrwxrwx 1 willbowditch staff 34 Jun 14 16:20 grDevices -> /usr/local/lib/R/library/grDevices
lrwxrwxrwx 1 willbowditch staff 29 Jun 14 16:20 grid -> /usr/local/lib/R/library/grid
lrwxrwxrwx 1 willbowditch staff 32 Jun 14 16:20 methods -> /usr/local/lib/R/library/methods
lrwxrwxrwx 1 willbowditch staff 33 Jun 14 16:20 parallel -> /usr/local/lib/R/library/parallel
lrwxrwxrwx 1 willbowditch staff 32 Jun 14 16:20 splines -> /usr/local/lib/R/library/splines
lrwxrwxrwx 1 willbowditch staff 30 Jun 14 16:20 stats -> /usr/local/lib/R/library/stats
lrwxrwxrwx 1 willbowditch staff 31 Jun 14 16:20 stats4 -> /usr/local/lib/R/library/stats4
lrwxrwxrwx 1 willbowditch staff 30 Jun 14 16:20 tcltk -> /usr/local/lib/R/library/tcltk
lrwxrwxrwx 1 willbowditch staff 30 Jun 14 16:20 tools -> /usr/local/lib/R/library/tools
lrwxrwxrwx 1 willbowditch staff 30 Jun 14 16:20 utils -> /usr/local/lib/R/library/utils
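
The same check can be done from within R instead of with ls. This is a small sketch using only base R; list_symlinked_packages is a hypothetical helper name, and the directory passed to it is whatever library path your project uses.

```r
# Hypothetical helper: list each entry in a library directory with its symlink target.
# Sys.readlink() returns "" for entries that are not symlinks.
list_symlinked_packages <- function(lib_dir) {
  entries <- list.files(lib_dir, full.names = TRUE)
  targets <- Sys.readlink(entries)
  data.frame(
    package = basename(entries),
    target = targets,
    is_symlink = !is.na(targets) & nzchar(targets),
    stringsAsFactors = FALSE
  )
}

# Example call, using the project library path from the listing above:
# list_symlinked_packages("packrat/lib/x86_64-pc-linux-gnu/3.4.0")
```

An entry whose is_symlink value is FALSE is a real directory, which is what you would expect to see after install.packages() has replaced a cache link.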

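If you try to install a package, it overwrites the symlink rather than fetching the package from the cache, so the cache cannot be used to speed up package installs.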

>install.packages('CheckDigit')
Installing package into ‘/home/willbowditch/packrattest/packrat/lib/x86_64-pc-linux-gnu/3.4.0’
(as ‘lib’ is unspecified)
trying URL 'https://mran.microsoft.com/snapshot/2017-06-07/src/contrib/CheckDigit_0.1-1.tar.gz'
Content type 'application/octet-stream' length 3777 bytes
==================================================
downloaded 3777 bytes

* installing *source* package ‘CheckDigit’ ...
** package ‘CheckDigit’ successfully unpacked and MD5 sums checked
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded
* DONE (CheckDigit)

The downloaded source packages are in
    ‘/tmp/RtmpxAU8pv/downloaded_packages’

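But it does speed up initializing a packrat project you are working on, provided the packages appear in require or library calls in the current directory. In that case packrat::init() or packrat::restore() restores the packages from the cache, but only if those packages have previously been used in a Packrat project with caching enabled.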

> packrat::init()
Initializing packrat project in directory:
- "~/six"
Fetching sources for BH (1.62.0-1) ... OK (CRAN current)
Fetching sources for DBI (0.6-1) ... OK (CRAN current)
Fetching sources for R6 (2.2.0) ... OK (CRAN current)
Fetching sources for Rcpp (0.12.10) ... OK (CRAN current)
Fetching sources for assertthat (0.2.0) ... OK (CRAN current)
Fetching sources for dplyr (0.5.0) ... OK (CRAN current)
Fetching sources for lazyeval (0.2.0) ... OK (CRAN current)
Fetching sources for magrittr (1.5) ... OK (CRAN current)
Fetching sources for packrat (0.4.8-1) ... OK (CRAN current)
Fetching sources for stringi (1.1.5) ... OK (CRAN current)
Fetching sources for tibble (1.3.0) ... OK (CRAN current)
Fetching sources for tidyr (0.6.2) ... OK (CRAN current)
Fetching sources for whisker (0.3-2) ... OK (CRAN current)
Snapshot written to '/home/willbowditch/six/packrat/packrat.lock'
Installing BH (1.62.0-1) ... 
    OK (symlinked cache)
Installing DBI (0.6-1) ... 
    OK (symlinked cache)
Installing R6 (2.2.0) ... 
    OK (symlinked cache)
Installing Rcpp (0.12.10) ... 
    OK (symlinked cache)
Installing assertthat (0.2.0) ... 
    OK (symlinked cache)
Installing lazyeval (0.2.0) ... 
    OK (symlinked cache)
Installing magrittr (1.5) ... 
    OK (symlinked cache)
Installing packrat (0.4.8-1) ... 
    OK (symlinked cache)
Installing stringi (1.1.5) ... 
    OK (symlinked cache)
Installing whisker (0.3-2) ... 
    OK (symlinked cache)
Installing tibble (1.3.0) ... 
    OK (symlinked cache)
Installing dplyr (0.5.0) ... 
    OK (symlinked cache)
Installing tidyr (0.6.2) ... 
    OK (symlinked cache)
Initialization complete!

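In other words, packages do not appear to move from the global library into the cache, but they can reach the cache from other Packrat libraries.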
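
To see in advance whether init() or restore() could symlink a package, you can look for it in the cache directory. This is a rough sketch that assumes the layout <cache>/v2/library/<package>/<hash>/<package> seen in the symlink targets earlier; the "v2" component may differ between packrat versions, and cached_hashes is a hypothetical helper name.

```r
# Hypothetical helper: return the hash directories cached for one package.
# ASSUMPTION: cache layout is <cache_dir>/v2/library/<pkg>/<hash>/<pkg>,
# inferred from the symlink targets shown in the directory listing.
cached_hashes <- function(pkg, cache_dir = Sys.getenv("R_PACKRAT_CACHE_DIR")) {
  pkg_root <- file.path(cache_dir, "v2", "library", pkg)
  if (!dir.exists(pkg_root)) return(character(0))
  list.dirs(pkg_root, full.names = FALSE, recursive = FALSE)
}

# Example call:
# cached_hashes("CheckDigit", "/home/willbowditch/R/packratcache")
```

An empty result suggests the package has never been installed by a cache-enabled Packrat project, so it would be built from source.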

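Quickly installing packages from the user home (~) library into a Packrat project

As far as I can tell, the cache option cannot speed things up for packages that have not already been installed by packrat. This can be a problem when installing large packages (e.g. tidyverse) from source, which is what you have to do on Linux systems.

There are a couple of workarounds:

Workaround 1: Symlink your library

A simple workaround is to symlink the packages in your user library into an empty packrat library directory. Install time with this method is a few seconds, and it does not seem to interfere with creating snapshots, as long as packrat::clean() is run at the end of development.

Steps

New project > Use packrat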

source('https://raw.githubusercontent.com/willbowditch/ratpack/master/R/ratpack.R')
symlink_packages()
#Develop as normal then run 
packrat::clean()
packrat::snapshot(ignore.stale=TRUE) 
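
For readers who would rather not source a remote script, the idea behind this workaround can be sketched in base R. This is not the ratpack implementation, only an illustration of the approach; symlink_user_packages and both of its arguments are hypothetical names.

```r
# Hypothetical sketch: symlink every package found in user_lib into project_lib,
# skipping names that already exist in the project library.
symlink_user_packages <- function(user_lib, project_lib) {
  user_lib <- normalizePath(user_lib)  # absolute path so the links resolve
  pkgs <- list.dirs(user_lib, full.names = FALSE, recursive = FALSE)
  missing <- setdiff(pkgs, list.files(project_lib))
  if (length(missing) == 0) return(invisible(character(0)))
  ok <- file.symlink(file.path(user_lib, missing),
                     file.path(project_lib, missing))
  invisible(missing[ok])  # names of the packages that were linked
}

# Example call (paths are placeholders for your own libraries):
# symlink_user_packages(.libPaths()[1], "packrat/lib/x86_64-pc-linux-gnu/3.4.0")
```

Skipping existing names keeps any package that packrat has already installed or linked from the cache untouched.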

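Workaround 2: external.packages

Packrat does provide a workaround for large packages through packrat::set_opts(external.packages=c('pkgname')), but packages installed this way are not included in the packrat/src folder.

In effect, the option symlinks the package directory into the packrat/lib-ext directory.

I have tried automating this in the same way as the symlink option: take all the user packages in the home directory and add them to the external.packages option.

Steps

New project > Use packrat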

source('https://raw.githubusercontent.com/willbowditch/ratpack/master/R/ratpack.R')
import_user_packages()
# All installed packages will now be accessible within the packrat session

Reset at the end of development

packrat::set_opts(external.packages=NULL)
packrat::snapshot()
packrat::restore() # This step will install the packages if they're not in the cache
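
The automation described for this workaround can be approximated without the remote script. The sketch below only collects the package names from a library; it is not the ratpack implementation, and user_package_names is a hypothetical helper name.

```r
# Hypothetical helper: names of all packages installed in one library directory.
user_package_names <- function(user_lib) {
  pkgs <- utils::installed.packages(lib.loc = user_lib)
  unname(pkgs[, "Package"])
}

# Possible use inside a packrat project (ASSUMPTION: the first entry of
# .libPaths() outside packrat mode was your user library, recorded beforehand):
# packrat::set_opts(external.packages = user_package_names("~/R/library"))
```

The path "~/R/library" is a placeholder; substitute the location of your own user library.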

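Easiest option

Somewhere in between these options probably makes the most sense: users curate a list of their large but commonly used packages to symlink (i.e. packrat::set_opts(external.packages=c('tidyverse', 'data.table'))), and then put up with installing smaller packages on a project-by-project basis.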