Does CRAN (or any of its relatives) have an API?

I am interested in retrieving machine-readable metadata about R packages.

For example, when I go to CRAN, I can see a short description of a package before downloading it: https://cran.r-project.org/web/packages/MASS/

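I could not find any way to get output other than HTML from the CRAN server. I would like to avoid parsing HTML and instead retrieve the package metadata in a more convenient format (e.g., JSON).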

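As far as I know, every R package ships a YAML-like(?) description text in its source bundle (a file named DESCRIPTION). However, so far I have only found this description inside the tar archive, which means I have to download the package before I can read its description.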

Here is an example of the DESCRIPTION file from the MASS package:

Package: MASS
Priority: recommended
Version: 7.3-55
Date: 2022-01-12
Revision: $Rev: 3559 $
Depends: R (>= 3.3.0), grDevices, graphics, stats, utils
Imports: methods
Suggests: lattice, nlme, nnet, survival
Authors@R: c(person("Brian", "Ripley", role = c("aut", "cre", "cph"),
                    email = "ripley@stats.ox.ac.uk"),
         person("Bill", "Venables", role = "ctb"),
         person(c("Douglas", "M."), "Bates", role = "ctb"),
         person("Kurt", "Hornik", role = "trl",
                     comment = "partial port ca 1998"),
         person("Albrecht", "Gebhardt", role = "trl",
                     comment = "partial port ca 1998"),
         person("David", "Firth", role = "ctb"))
Description: Functions and datasets to support Venables and Ripley,
  "Modern Applied Statistics with S" (4th edition, 2002).
Title: Support Functions and Datasets for Venables and Ripley's MASS
LazyData: yes
ByteCompile: yes
License: GPL-2 | GPL-3
URL: http://www.stats.ox.ac.uk/pub/MASS4/
Contact: <MASS@stats.ox.ac.uk>
NeedsCompilation: yes
Packaged: 2022-01-13 05:06:37 UTC; ripley
Author: Brian Ripley [aut, cre, cph],
  Bill Venables [ctb],
  Douglas M. Bates [ctb],
  Kurt Hornik [trl] (partial port ca 1998),
  Albrecht Gebhardt [trl] (partial port ca 1998),
  David Firth [ctb]
Maintainer: Brian Ripley <ripley@stats.ox.ac.uk>
Repository: CRAN
Date/Publication: 2022-01-13 08:05:04 UTC

Is there any way to get this directly in a machine-readable and convenient form?

I tried to look it up, but search engines have not given me any useful results so far.

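Edit/clarification: I am looking for a solution that does not depend on R, namely a web API for metadata retrieval that is independent of the framework or language used.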

One acceptable solution is the METACRAN API, available here: https://crandb.r-pkg.org/
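A minimal sketch of querying that API from outside R, in Python with only the standard library. It assumes that a request to https://crandb.r-pkg.org/<package> returns a JSON object for the latest version, and that its keys mirror DESCRIPTION field names such as Package, Version, Title and License. The helper names package_url, fetch_package and summarize are made up for this illustration.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the question; the URL layout below is an assumption.
CRANDB_BASE = "https://crandb.r-pkg.org"


def package_url(name):
    """Build the URL assumed to return JSON for the latest version of a package."""
    return CRANDB_BASE + "/" + urllib.parse.quote(name, safe="")


def fetch_package(name):
    """Download and decode the JSON record for one package (needs network access)."""
    with urllib.request.urlopen(package_url(name), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize(record):
    """Pick a few fields, assuming keys named like DESCRIPTION fields.

    Missing keys come back as None instead of raising an error.
    """
    wanted = ("Package", "Version", "Title", "License")
    return {key: record.get(key) for key in wanted}


# Example call (performs an HTTP request, so it is left commented out):
# print(summarize(fetch_package("MASS")))
```

Using record.get() rather than indexing keeps the sketch tolerant of fields that a given package does not declare.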

Does tools::CRAN_package_db() have all the information you want?

> dd <- tools::CRAN_package_db()
> names(dd)
 [1] "Package"                 "Version"                
 [3] "Priority"                "Depends"                
 [5] "Imports"                 "LinkingTo"              
 [7] "Suggests"                "Enhances"               
 [9] "License"                 "License_is_FOSS"        
[11] "License_restricts_use"   "OS_type"                
[13] "Archs"                   "MD5sum"                 
[15] "NeedsCompilation"        "Additional_repositories"
[17] "Author"                  "Authors@R"              
[19] "Biarch"                  "BugReports"             
[21] "BuildKeepEmpty"          "BuildManual"            
[23] "BuildResaveData"         "BuildVignettes"         
[25] "Built"                   "ByteCompile"            
[27] "Classification/ACM"      "Classification/ACM-2012"
[29] "Classification/JEL"      "Classification/MSC"     
[31] "Classification/MSC-2010" "Collate"                
[33] "Collate.unix"            "Collate.windows"        
[35] "Contact"                 "Copyright"              
[37] "Date"                    "Description"            
[39] "Encoding"                "KeepSource"             
[41] "Language"                "LazyData"               
[43] "LazyDataCompression"     "LazyLoad"               
[45] "MailingList"             "Maintainer"             
[47] "Note"                    "Packaged"               
[49] "RdMacros"                "StagedInstall"          
[51] "SysDataCompression"      "SystemRequirements"     
[53] "Title"                   "Type"                   
[55] "URL"                     "UseLTO"                 
[57] "VignetteBuilder"         "ZipData"                
[59] "Published"               "Path"                   
[61] "X-CRAN-Comment"          "Reverse depends"        
[63] "Reverse imports"         "Reverse linking to"     
[65] "Reverse suggests"        "Reverse enhances"       

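You can download https://cloud.r-project.org/src/contrib/PACKAGES.gz (or the uncompressed form, https://cloud.r-project.org/src/contrib/PACKAGES). It contains information about all currently available packages in DCF format, using some of the fields from the DESCRIPTION file plus a few others.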

You do not have to use cloud.r-project.org; any CRAN mirror will do.
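Since the PACKAGES index is plain DCF text, it can be read without R. The following is a rough Python sketch, not a complete DCF implementation: it assumes records are separated by blank lines, that each field is written as "Name: value", and that continuation lines start with whitespace (as in the MASS DESCRIPTION shown in the question). The names parse_dcf and fetch_packages_index are hypothetical.

```python
import gzip
import urllib.request

# URL taken from the answer above; any CRAN mirror should serve the same path.
PACKAGES_URL = "https://cloud.r-project.org/src/contrib/PACKAGES.gz"


def parse_dcf(text):
    """Parse DCF text into a list of dicts, one per blank-line-separated record."""
    records = []
    current = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record.
            if current:
                records.append(current)
            current, last_key = {}, None
        elif line[0] in " \t":
            # Continuation line: append to the value of the previous field.
            if last_key is not None:
                joined = current[last_key] + " " + line.strip()
                current[last_key] = joined.strip()
        else:
            key, sep, value = line.partition(":")
            if not sep:
                continue  # no colon: skip the malformed line
            last_key = key.strip()
            current[last_key] = value.strip()
    if current:
        records.append(current)
    return records


def fetch_packages_index(url=PACKAGES_URL):
    """Download the gzip-compressed index and parse it (needs network access)."""
    with urllib.request.urlopen(url, timeout=60) as response:
        raw = response.read()
    text = gzip.decompress(raw).decode("utf-8", errors="replace")
    return parse_dcf(text)


# Example call (performs an HTTP request, so it is left commented out):
# index = {rec["Package"]: rec for rec in fetch_packages_index()}
# print(index["MASS"].get("Version"))
```

The same parse_dcf function should also handle a single DESCRIPTION file, since that is one DCF record. From there, json.dumps on the resulting dicts gives the JSON form asked for in the question.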