Rvest: Turn Final String Into Multiple Rows
I have used rvest to scrape the commentary for a football match from ESPNFC.co.uk, but I am struggling to get the final output I need.
library("rvest")
library("xlsx")
# read_html() replaces html(), which was deprecated in later rvest releases
espnfc <- read_html("http://www.espnfc.co.uk/commentary/422421/commentary.html")
# Pull the text of the whole commentary container
commentary <- espnfc %>%
  html_nodes("#convo-window") %>%
  html_text()
# Strip newlines, carriage returns and tabs in one pass
commentary <- gsub("[\n\r\t]", "", commentary)
The final output is one giant string, but I would like each minute's action to be a row in a data frame, for example:
"90'Second Half ends, Liverpool 2, Sunderland 2."
"90'Attempt blocked. Adam Johnson (Sunderland) right footed shot from outside the box is blocked. Assisted by Patrick van Aanholt."
"90'Attempt missed. Jordon Ibe (Liverpool) right footed shot from outside the box is close, but misses to the left. Assisted by Mamadou Sakho."
"90'Lucas Leiva (Liverpool) wins a free kick in the attacking half."
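How can I do this?
Using more specific CSS selectors will make your life easier: select each comment and each timestamp separately instead of the whole container.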
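library("rvest")
# read_html() replaces html(), which was deprecated in later rvest releases
espnfc <- read_html("http://www.espnfc.co.uk/commentary/422421/commentary.html")
# One element per commentary entry
commentary <- espnfc %>%
  html_nodes(".comment p") %>%
  html_text()
# One element per timestamp, in the same order as the comments
minute <- espnfc %>%
  html_nodes(".timestamp p") %>%
  html_text()
# Both vectors must have the same length for the rows to line up
df <- data.frame(minute = minute, commentary = commentary, stringsAsFactors = FALSE)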
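If only the collapsed string is available, the rows can also be recovered with base R string functions. The following is a minimal sketch, not the answerer's method. It assumes every entry starts with a minute marker such as 90' or 45+2', and that no such pattern occurs inside the commentary text itself. The sample string is built from two of the rows quoted in the question.

```r
# Hypothetical sample of the collapsed string (two rows from the question).
commentary <- paste0(
  "90'Second Half ends, Liverpool 2, Sunderland 2.",
  "90'Lucas Leiva (Liverpool) wins a free kick in the attacking half."
)

# Put a newline in front of every minute marker such as 90' or 45+2'.
marked <- gsub("(\\d+(\\+\\d+)?')", "\n\\1", commentary, perl = TRUE)

# Split on the newline and drop the empty piece before the first marker.
rows <- strsplit(marked, "\n", fixed = TRUE)[[1]]
rows <- rows[nzchar(rows)]

# Separate the minute from the text of each row.
df <- data.frame(
  minute = sub("'.*$", "", rows),
  commentary = sub("^[^']*'", "", rows),
  stringsAsFactors = FALSE
)
```

Selecting the nodes directly is still the more robust route, since it does not depend on how the minute markers happen to be written.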