在 drake 中使用 code_to_plan 和 target(..., format = "fst")
Using code_to_plan and target(..., format = "fst") in drake
我非常喜欢在构建 drake 计划时使用 code_to_plan
功能。我也真的使用 target(..., format = "fst")
来处理大文件。但是,我正在努力将这两个工作流程结合起来。例如,如果我有这个 _drake.R
文件:
# Data --------------------------------------------------------------------
data_plan = code_to_plan("code/01-data/data.R")
join_plan = code_to_plan("code/01-data/merging.R")
# Cleaning ----------------------------------------------------------------
cleaning_plan = code_to_plan("code/02-cleaning/remove_na.R")
# Model -------------------------------------------------------------------
model_plan = code_to_plan("code/03-model/model.R")
# Combine Plans
dplan = bind_plans(
data_plan,
join_plan,
cleaning_plan,
model_plan
)
config <- drake_config(dplan)
当使用 r_make(r_args = list(show = TRUE))
调用时,这工作正常
据我了解,虽然 target
只能在 drake_plan
中使用。如果我尝试这样的事情:
dplan2 <- drake_plan(full_plan = target(dplan, format = "fst"))
config <- drake_config(dplan2)
我收到这样的 r_make
错误:
target full_plan
Error in fst::write_fst(x = value$value, path = tmp) :
Unknown type found in column.
In addition: Warning message:
You selected fst format for target full_plan, so drake will convert it from class c("drake_plan", "tbl_df", "tbl", "data.frame") to a plain data frame.
Error:
-->
in process 18712
See .Last.error.trace
for a stack trace.
所以最终我的问题是,当您使用 code_to_plan
时,在哪里指定 special data formats for targets?
编辑
使用@landau 的有用建议,我定义了这个函数:
add_target_format <- function(plan) {
# Get a list of named commands.
commands <- plan$command
names(commands) <- plan$target
# Turn it into a good plan.
do.call(drake_plan, commands)
}
这样就可以了:
dplan = bind_plans(
data_plan,
join_plan,
cleaning_plan,
model_plan
) %>%
add_target_format()
可以,但不方便。这是一个解决方法。
writeLines(
c(
"x <- small_data()",
"y <- target(large_data(), format = \"fst\")"
),
"script.R"
)
cat(readLines("script.R"), sep = "\n")
#> x <- small_data()
#> y <- target(large_data(), format = "fst")
library(drake)
# Produces a plan, but does not process target().
bad_plan <- code_to_plan("script.R")
bad_plan
#> # A tibble: 2 x 2
#> target command
#> <chr> <expr>
#> 1 x small_data()
#> 2 y target(large_data(), format = "fst")
# Get a list of named commands.
commands <- bad_plan$command
names(commands) <- bad_plan$target
# Turn it into a good plan.
good_plan <- do.call(drake_plan, commands)
good_plan
#> # A tibble: 2 x 3
#> target command format
#> <chr> <expr> <chr>
#> 1 x small_data() <NA>
#> 2 y large_data() fst
由 reprex package (v0.3.0)
创建于 2019-12-18
我非常喜欢在构建 drake 计划时使用 code_to_plan
功能。我也真的使用 target(..., format = "fst")
来处理大文件。但是,我正在努力将这两个工作流程结合起来。例如,如果我有这个 _drake.R
文件:
# Data --------------------------------------------------------------------
data_plan = code_to_plan("code/01-data/data.R")
join_plan = code_to_plan("code/01-data/merging.R")
# Cleaning ----------------------------------------------------------------
cleaning_plan = code_to_plan("code/02-cleaning/remove_na.R")
# Model -------------------------------------------------------------------
model_plan = code_to_plan("code/03-model/model.R")
# Combine Plans
dplan = bind_plans(
data_plan,
join_plan,
cleaning_plan,
model_plan
)
config <- drake_config(dplan)
当使用 r_make(r_args = list(show = TRUE))
据我了解,虽然 target
只能在 drake_plan
中使用。如果我尝试这样的事情:
dplan2 <- drake_plan(full_plan = target(dplan, format = "fst"))
config <- drake_config(dplan2)
我收到这样的 r_make
错误:
target full_plan Error in fst::write_fst(x = value$value, path = tmp) : Unknown type found in column. In addition: Warning message: You selected fst format for target full_plan, so drake will convert it from class c("drake_plan", "tbl_df", "tbl", "data.frame") to a plain data frame.
Error: --> in process 18712
See
.Last.error.trace
for a stack trace.
所以最终我的问题是,当您使用 code_to_plan
时,在哪里指定 special data formats for targets?
编辑
使用@landau 的有用建议,我定义了这个函数:
add_target_format <- function(plan) {
# Get a list of named commands.
commands <- plan$command
names(commands) <- plan$target
# Turn it into a good plan.
do.call(drake_plan, commands)
}
这样就可以了:
dplan = bind_plans(
data_plan,
join_plan,
cleaning_plan,
model_plan
) %>%
add_target_format()
可以,但不方便。这是一个解决方法。
writeLines(
c(
"x <- small_data()",
"y <- target(large_data(), format = \"fst\")"
),
"script.R"
)
cat(readLines("script.R"), sep = "\n")
#> x <- small_data()
#> y <- target(large_data(), format = "fst")
library(drake)
# Produces a plan, but does not process target().
bad_plan <- code_to_plan("script.R")
bad_plan
#> # A tibble: 2 x 2
#> target command
#> <chr> <expr>
#> 1 x small_data()
#> 2 y target(large_data(), format = "fst")
# Get a list of named commands.
commands <- bad_plan$command
names(commands) <- bad_plan$target
# Turn it into a good plan.
good_plan <- do.call(drake_plan, commands)
good_plan
#> # A tibble: 2 x 3
#> target command format
#> <chr> <expr> <chr>
#> 1 x small_data() <NA>
#> 2 y large_data() fst
由 reprex package (v0.3.0)
创建于 2019-12-18