Using rvest with drake: external pointer is not valid error
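When I run the code below for the first time, everything works. But when I change something in the html_file %>%... command, for example by commenting out tolower(), I get the following error: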
Error: target title failed.
diagnose(title)$error$message:
  external pointer is not valid
diagnose(title)$error$calls:
  1. └─html_file %>% html_nodes("h2") %>% html_text()
Code:
library(rvest)
library(drake)

some_string <- '
<div class="main">
  <h2>A</h2>
  <div class="route">X</div>
</div>
'

html_file <- read_html(some_string)
title <- html_file %>%
  html_nodes("h2") %>%
  html_text()

plan <- drake_plan(
  html_file = read_html(some_string),
  title = html_file %>%
    html_nodes("h2") %>%
    html_text() %>%
    tolower()
)

make(plan)
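I found two possible solutions, but I am not keen on either of them.
1. Combine the two steps in drake_plan into one.
2. Use xml2::write_html() and xml2::read_html(), as suggested here.
Is there a better way to solve it?
P.S. This problem has already been discussed on the RStudio forum and on GitHub.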
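By default, drake saves targets as RDS files (other storage formats are available). So https://github.com/tidyverse/rvest/issues/181#issuecomment-395064636, which you brought up, is exactly the problem: the object returned by read_html() holds an external pointer to an in-memory document, and that pointer does not survive being written to RDS and read back. The first make() works because title is built right after html_file in the same session; once you edit the title command, drake reloads html_file from the cache and the pointer is no longer valid.
I like (1) because text is compatible with RDS. Speaking broadly, it is up to the user to choose good targets that are compatible with drake's data storage system. See https://books.ropensci.org/drake/plans.html#how-to-choose-good-targets for a discussion and links to similar issues. But if you want to go with (2), you could return the file path to your HTML file from within a dynamic file.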
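Both options can be sketched in a single plan. This is only an illustration, not the one correct layout: the file name page.html and the target names html_path and title_from_file are made up for the example, and the dynamic-file part assumes a drake version that supports target(..., format = "file").

```r
library(rvest)
library(drake)

some_string <- '
<div class="main">
  <h2>A</h2>
  <div class="route">X</div>
</div>
'

plan <- drake_plan(
  # Option (1): parse and extract in one target, so only a character
  # vector (which serializes cleanly to RDS) ends up in the cache.
  title = read_html(some_string) %>%
    html_nodes("h2") %>%
    html_text() %>%
    tolower(),

  # Option (2): write the HTML to disk and return the path from a
  # dynamic file target. "page.html" is a hypothetical file name.
  html_path = target(
    {
      path <- "page.html"
      xml2::write_html(read_html(some_string), path)
      path
    },
    format = "file"
  ),

  # Downstream targets re-read the file instead of reusing a cached
  # xml_document, so no stale external pointer is involved.
  title_from_file = read_html(html_path) %>%
    html_nodes("h2") %>%
    html_text() %>%
    tolower()
)

make(plan)
```

In both variants, editing the extraction pipeline (for example removing tolower()) only rebuilds targets whose stored values are plain text or file paths, which is the property the answer above relies on.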