How to combine and filter dynamic file targets in R drake?

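I create a set of files in my drake plan. I want to copy a subset of these files to another location.

The code below almost does this. However, once I take the subset of file targets that I want to copy, drake no longer tracks changes to those files as dependencies.

How can I combine/subset dynamic file targets without losing drake's dependency tracking?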

copy_file <- function(file) {
  file_copy <- paste0(file, "_copy")
  file.copy(from = file, to = file_copy, overwrite = TRUE)
  file_copy
}

herb_1_a <- "parsley"
plan <- drake::drake_plan(
  file_1 = target(
    {
      writeLines(herb_1_a, "file_1_a") # herb_1_a changes before the second make()
      writeLines("sage", "file_1_b")
      c("file_1_a", "file_1_b")
    },
    format = "file"
  ),

  file_2 = target(
    {
      writeLines("rosemary", "file_2_a")
      writeLines("thyme", "file_2_b")
      c("file_2_a", "file_2_b")
    },
    format = "file"
  ),

  files_to_copy = stringr::str_subset(
    c(file_1, file_2),
    "_a$"
  ),

  file_copies = target(
    copy_file(files_to_copy),
    dynamic = map(files_to_copy),
    format = "file"
  )
)

drake::make(plan)
#> ▶ target file_2
#> ▶ target file_1
#> ▶ target files_to_copy
#> ▶ dynamic file_copies
#> > subtarget file_copies_5e57e9ee
#> > subtarget file_copies_ae26ecf9
#> ■ finalize file_copies
readLines("file_1_a")
#> [1] "parsley"
readLines("file_1_a_copy")
#> [1] "parsley"
herb_1_a <- 'banana'
drake::make(plan)
#> ▶ target file_1
#> ▶ target files_to_copy
readLines("file_1_a")
#> [1] "banana"
readLines("file_1_a_copy") # I want this to be "banana"
#> [1] "parsley"

Created on 2020-09-24 by the reprex package (v0.3.0)

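I think the way to solve this is to create a set of dynamically mapped dynamic file inputs before the copy step. That is, files_to_copy should itself be a dynamic target made up of dynamic files. Sketch: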

plan <- drake::drake_plan(
  file_1 = target(
    {
      writeLines(herb_1_a, "file_1_a") # herb_1_a changes before the second make()
      writeLines("sage", "file_1_b")
      c("file_1_a", "file_1_b")
    },
    format = "file"
  ),
  
  file_2 = target(
    {
      writeLines("rosemary", "file_2_a")
      writeLines("thyme", "file_2_b")
      c("file_2_a", "file_2_b")
    },
    format = "file"
  ),
  
  files_to_copy_group = stringr::str_subset(
    c(file_1, file_2),
    "_a$"
  ),
  
  files_to_copy = target(
    files_to_copy_group,
    dynamic = map(files_to_copy_group),
    format = "file"
  ),
  
  file_copies = target(
    copy_file(files_to_copy),
    dynamic = map(files_to_copy),
    format = "file"
  )
)
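
A likely reason the extra format = "file" step helps, as far as I understand it: a plain target such as the subset of paths stores only the path strings, and those strings stay the same when a file's contents change, so nothing downstream looks outdated. The base R sketch below illustrates that idea outside of drake. The temporary file and the use of tools::md5sum as a stand-in for a content hash are assumptions made for illustration, not what drake does internally.

```r
# Illustration only: base R, no drake. The temp file path is hypothetical.
path <- file.path(tempdir(), "file_1_a")

# First "run": write the file, then record the filtered paths and a content hash.
writeLines("parsley", path)
paths_before <- grep("_a$", path, value = TRUE)  # what a plain target would store
hash_before <- unname(tools::md5sum(path))       # stand-in for file-content tracking

# Second "run": the file contents change, but the path string does not.
writeLines("banana", path)
paths_after <- grep("_a$", path, value = TRUE)
hash_after <- unname(tools::md5sum(path))

same_paths <- identical(paths_before, paths_after)   # value of the plain target
same_content <- identical(hash_before, hash_after)   # what file tracking would see
```

Here same_paths is TRUE while same_content is FALSE, which mirrors why a target that only holds path strings cannot signal a content change on its own.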