R Cannot Open Connection

I am scraping some shot data from the NBA.com API. The URL I am using is

url = "stats.nba.com/stats/shotchartdetail?CFID=33&CFPARAMS=2017-18&ContextFilter=&ContextMeasure=FGA&DateFrom=&DateTo=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerID=101107&PlusMinus=N&PlayerPosition=&Rank=N&RookieYear=&Season=2017-18&SeasonSegment=&SeasonType=Regular%20Season&TeamID=0&VsConference=&VsDivision="

You can easily verify that this site exists by copying and pasting the URL into your browser. However, when I run the line

data = rjson::fromJSON(file = url)

I get the error: Error in file(con, "r") : cannot open the connection ... HTTP status was '403 Forbidden'.

I have tried adding http and https to the URL, but to no avail. Why won't R read a URL that clearly exists?

Overview

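You'll need to download the data to a local file, then import the .json file with fromJSON(). I've shown how to extract the two data frames contained in the list object marvin.williams.shot.data:

  1. Marvin Williams's individual 2017-2018 shot data; and
  2. the 2017-2018 NBA league average shot data.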

Reproducible example

# load necessary packages
library( jsonlite )

# load necessary data
download.file( url = "http://stats.nba.com/stats/shotchartdetail?CFID=33&CFPARAMS=2017-18&ContextFilter=&ContextMeasure=FGA&DateFrom=&DateTo=&GameID=&GameSegment=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerID=101107&PlusMinus=N&PlayerPosition=&Rank=N&RookieYear=&Season=2017-18&SeasonSegment=&SeasonType=Regular%20Season&TeamID=0&VsConference=&VsDivision="
               , destfile = "stats_nba.json" )

# parse the downloaded JSON file into a list
marvin.williams.shot.data <- 
  fromJSON( txt = "stats_nba.json" )

# view results
lapply( X = marvin.williams.shot.data, FUN = class)
# $resource
# [1] "character"
# 
# $parameters
# [1] "list"
# 
# $resultSets
# [1] "data.frame"

# transform the player matrix into a data frame
player.shotchart.df <-
  as.data.frame( marvin.williams.shot.data$resultSets$rowSet[[1]]
                 , stringsAsFactors = FALSE )

# assign colnames
colnames( player.shotchart.df ) <-
  marvin.williams.shot.data$resultSets$headers[[1]]

# view results
dim( player.shotchart.df ) # [1] 563  24

# transform the league average matrix into a data frame
league.average.df <-
  as.data.frame( marvin.williams.shot.data$resultSets$rowSet[[2]]
                 , stringsAsFactors = FALSE )

# assign colnames
colnames( league.average.df ) <-
  marvin.williams.shot.data$resultSets$headers[[2]]

# view results
dim( league.average.df ) # [1] 20  7

# end of script #
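
The two extraction steps above repeat the same pattern (take a matrix from rowSet, attach the matching headers), so they could be folded into one small helper. The following is only a sketch under stated assumptions: result.set.to.df and mock.result.sets are hypothetical names, and the mock list is a made-up stand-in for marvin.williams.shot.data$resultSets so that the snippet runs without a download. It uses base R only.

```r
# Hypothetical stand-in for marvin.williams.shot.data$resultSets:
# 'headers' holds one character vector per result set and
# 'rowSet' holds one character matrix per result set (values are made up).
mock.result.sets <- list(
  headers = list( c( "PLAYER_NAME", "SHOT_DISTANCE" )
                  , c( "SHOT_ZONE_BASIC", "FG_PCT" ) )
  , rowSet = list( matrix( c( "Player A", "Player A", "3", "24" ), ncol = 2 )
                   , matrix( c( "Mid-Range", "0.4" ), ncol = 2 ) )
)

# hypothetical helper: build the i-th data frame from its matrix and headers
result.set.to.df <- function( result.sets, i ) {
  df <- as.data.frame( result.sets$rowSet[[ i ]], stringsAsFactors = FALSE )
  colnames( df ) <- result.sets$headers[[ i ]]
  df
}

# apply the helper to every result set
all.dfs <- lapply( X = seq_along( mock.result.sets$rowSet )
                   , FUN = result.set.to.df
                   , result.sets = mock.result.sets )

# every column starts out as character because the source matrix is character;
# optionally let R guess a better type for each column
all.dfs[[ 1 ]][] <- lapply( X = all.dfs[[ 1 ]], FUN = type.convert, as.is = TRUE )

# view results
lapply( X = all.dfs, FUN = dim )
```

With the real data, passing marvin.williams.shot.data$resultSets in place of the mock should give the same two data frames as player.shotchart.df and league.average.df, assuming the response keeps the headers/rowSet layout used in the script above.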