Scrape Data into R
I am currently trying to scrape the Player Standard Stats table into R, but I cannot get the correct table.
library(rvest)  # also provides the %>% pipe
html_link <- "https://fbref.com/en/comps/9/stats/Premier-League-Stats#stats_standard::1"
df <- html_link %>%
  xml2::read_html() %>%
  rvest::html_nodes("table") %>%
  rvest::html_table(fill = TRUE)
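The page offers a link that can be copied to the clipboard, so I tried using that link to scrape the data, but it looks like I am not getting the right result. Does anyone know how to do this automatically in R without downloading a CSV file?
Thanks.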
You can use the "embed link" on the table:
library(rvest)  # also provides the %>% pipe
url <- "https://widgets.sports-reference.com/wg.fcgi?css=1&site=fb&url=%2Fen%2Fcomps%2F9%2Fstats%2FPremier-League-Stats&div=div_stats_standard"
f <- url %>%
  xml2::read_html() %>%
  rvest::html_nodes("table") %>%
  rvest::html_table() %>%
  # the first table in the widget page is the standard stats table
  .[[1]]
> head(f)
1 Rk Player Nation Pos Squad Age Born
2 1 Patrick van Aanholt nl NED DF Crystal Palace 30-170 1990
3 2 Tammy Abraham eng ENG FW Chelsea 23-136 1997
4 3 Che Adams eng ENG FW Southampton 24-217 1996
5 4 Tosin Adarabioyo eng ENG DF Fulham 23-144 1997
6 5 Adrián es ESP GK Liverpool 34-043 1987
Playing Time Playing Time Playing Time Playing Time Performance
1 MP Starts Min 90s Gls
2 14 13 1,144 12.7 0
3 18 10 957 10.6 6
4 22 20 1,735 19.3 4
5 19 19 1,710 19.0 0
6 2 2 180 2.0 0
Performance Performance Performance Performance Performance
1 Ast G-PK PK PKatt CrdY
2 1 0 0 0 1
3 1 6 0 0 0
4 4 4 0 0 1
5 0 0 0 0 1
6 0 0 0 0 0
Performance Per 90 Minutes Per 90 Minutes Per 90 Minutes
1 CrdR Gls Ast G+A
2 0 0.00 0.08 0.08
3 0 0.56 0.09 0.66
4 0 0.21 0.21 0.41
5 0 0.00 0.00 0.00
6 0 0.00 0.00 0.00
Per 90 Minutes Per 90 Minutes Expected Expected Expected Expected
1 G-PK G+A-PK xG npxG xA npxG+xA
2 0.00 0.08 0.8 0.8 0.8 1.6
3 0.56 0.66 5.5 5.5 0.9 6.3
4 0.21 0.41 5.1 5.1 4.3 9.4
5 0.00 0.00 0.8 0.8 0.1 0.9
6 0.00 0.00 0.0 0.0 0.0 0.0
Per 90 Minutes Per 90 Minutes Per 90 Minutes Per 90 Minutes
1 xG xA xG+xA npxG
2 0.06 0.06 0.12 0.06
3 0.51 0.08 0.60 0.51
4 0.26 0.22 0.49 0.26
5 0.04 0.01 0.05 0.04
6 0.00 0.00 0.00 0.00
Per 90 Minutes
1 npxG+xA Matches
2 0.12 Matches
3 0.60 Matches
4 0.49 Matches
5 0.05 Matches
6 0.00 Matches
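In the output above, the real column names (Rk, Player, Nation, ...) arrive as the first data row, and every column is text (note "1,144" in Min). A minimal sketch of one way to tidy this up with base R follows. The helper name clean_scraped_table is hypothetical, and the step that drops repeated header rows assumes the page may repeat its header inside the table body, which the output shown here does not confirm.

```r
# Hypothetical helper (not part of rvest): promote the first row of a
# scraped table to column names and convert numeric-looking columns.
clean_scraped_table <- function(df) {
  # Row 1 holds the real column names (Rk, Player, ...).
  header <- as.character(unlist(df[1, ], use.names = FALSE))

  # make.unique() disambiguates names that occur more than once, e.g. Gls
  # appears under both "Performance" and "Per 90 Minutes".
  names(df) <- make.unique(header)
  df <- df[-1, , drop = FALSE]

  # Assumption: drop any rows that repeat the header inside the table body.
  df <- df[!(df[[1]] %in% header[1]), , drop = FALSE]

  df[] <- lapply(df, function(col) {
    col <- as.character(col)
    # Remove thousands separators such as the comma in "1,144".
    stripped <- gsub(",", "", col, fixed = TRUE)
    num <- suppressWarnings(as.numeric(stripped))
    # Convert only when every non-empty value parses as a number;
    # otherwise keep the column as text (e.g. Player, Age "30-170").
    if (all(!is.na(num) | is.na(col) | col == "")) num else col
  })

  rownames(df) <- NULL
  df
}

# Small stand-in for the scraped table, using values from the output above.
raw <- data.frame(
  X1 = c("Rk", "1", "2"),
  X2 = c("Player", "Patrick van Aanholt", "Tammy Abraham"),
  X3 = c("Min", "1,144", "957"),
  X4 = c("Gls", "0", "6"),
  X5 = c("Gls", "0.00", "0.56"),
  stringsAsFactors = FALSE
)

stats <- clean_scraped_table(raw)
str(stats)
```

With the real data, the same call would be clean_scraped_table(f). The duplicated names come out as Gls and Gls.1, so you may prefer to paste the group labels from the original header onto them instead.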