R scraping a table with rvest, selectors not working

Question

I am trying to scrape the table at the link below:

https://www.pgatour.com/university/full-ranking.html

I would like the output to look like this:

Rank     Player           University
1        Pierceson Coody  University of Texas
2        Sam Bennett      Texas A&M

The td classes of the table's columns are 'rank', 'player', and 'name'. When I use these as my selector, the result shows up as 'character (empty)' in the Values section of RStudio.

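library(rvest)  # provides read_html(), html_nodes(), html_text() and %>%

pga_url <- 'https://www.pgatour.com/university/full-ranking.html'

# Download and parse the page's HTML
pgaU <- read_html(pga_url)
select <- '.name'

# Extract the text of every node matching the selector
p <- html_nodes(pgaU, select) %>%
  html_text()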

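The data sits under an HTML tag, so I am not sure whether the problem is the function I am using or the selector. Articles that scrape tables using Wikipedia as the example have not helped. I had not used the Copy element/XPath/selector options in Inspect before, and I have not yet figured out how to get them to work.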

Answer 1

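The table appears to be filled in by JavaScript after the page loads, so the HTML that read_html() downloads does not contain those cells, and the selector matches nothing. You can request the data directly from the JSON endpoint instead: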

jsonlite::fromJSON("https://statdata-api-prod.pgatour.com/api/clientfile/PGATourUniversityRankings?format=json&week=39")
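
The shape of the response is not shown here, so a reasonable next step is to look for the data frame that holds the ranking rows. The following is a minimal sketch, not a definitive implementation: `fetch_rankings` and `find_data_frames` are hypothetical helper names, the toy list only stands in for a parsed response, and the real column names are unknown until you inspect the result.

```r
# Endpoint copied from the call above; week=39 is kept as given.
rankings_url <- "https://statdata-api-prod.pgatour.com/api/clientfile/PGATourUniversityRankings?format=json&week=39"

# Hypothetical helper: download and parse the JSON (needs jsonlite and network access).
fetch_rankings <- function(url = rankings_url) {
  jsonlite::fromJSON(url)
}

# Hypothetical helper: walk a parsed JSON object and report the path and
# column names of every data frame found inside it.
find_data_frames <- function(x, path = "root") {
  if (is.data.frame(x)) {
    return(setNames(list(names(x)), path))
  }
  if (is.list(x) && length(x) > 0) {
    nms <- names(x)
    if (is.null(nms)) nms <- as.character(seq_along(x))
    out <- list()
    for (i in seq_along(x)) {
      out <- c(out, find_data_frames(x[[i]], paste0(path, "$", nms[i])))
    }
    return(out)
  }
  list()
}

# Offline illustration with a made-up structure (NOT the real response).
toy <- list(
  meta = list(week = 1),
  rows = data.frame(a = 1:2, b = c("x", "y"))
)
found <- find_data_frames(toy)
print(found)

# Against the real endpoint (uncomment to run):
# res <- fetch_rankings()
# str(res, max.level = 2)
# find_data_frames(res)
```

Once the path of the data frame is known, the rank, player, and university columns can be selected from it with ordinary subsetting.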
