Download count of GitHub release in R

I am trying to get the download count of a public Google repo using the GitHub API and R v3.1.2. Using the public samples repo, I have the following:

library(jsonlite)
library(httr)

url <- "https://api.github.com/repos/googlesamples/google-services/downloads"
response <- GET(url)
json <- content(response, "text")
json <- fromJSON(json)

print(json)

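However, I noticed that json comes back as an empty list. Is that because this public repo has no releases? The goal is to determine how many times this repo, or any other public repo, has been downloaded. Is this even possible?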

The API mentioned in "Get a single release" is:

GET /repos/:owner/:repo/releases/:id

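As noted in the comments, you need to apply it to a release of the repo.
For example, here is a Python gist (by Philip Hansen - Hanse00) that extracts download_count.
(It is not in R, but it shows how to use the /repos/:owner/:repo/releases/:id URL.)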

Excerpt:

#Iterate through every tag
search_point = 0
while formatted_string.find("tag_name", search_point) != -1:
    #Find where in the string the tag and download texts are
    find_point = formatted_string.find("tag_name", search_point)
    download_point = formatted_string.find("download_count", find_point)
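
The excerpt above locates "tag_name" and "download_count" by searching the raw response text. As a rough alternative sketch (not the gist author's code), the same fields can be read by parsing the response as JSON. The function name download_counts_by_tag and the sample data are made up for illustration; the field names tag_name, assets and download_count are the ones the releases API returns.

```python
import json


def download_counts_by_tag(releases_text):
    """Map each release's tag_name to the summed download_count of its assets."""
    releases = json.loads(releases_text)
    counts = {}
    for release in releases:
        # A release with no uploaded files has an empty "assets" list.
        assets = release.get("assets", [])
        counts[release["tag_name"]] = sum(a["download_count"] for a in assets)
    return counts


# Hypothetical sample, shaped like a /repos/:owner/:repo/releases response.
sample = """[
  {"tag_name": "v1.1", "assets": [{"name": "a.zip", "download_count": 7},
                                   {"name": "b.zip", "download_count": 3}]},
  {"tag_name": "v1.0", "assets": []}
]"""

print(download_counts_by_tag(sample))
```

Parsing the JSON avoids having to track positions in the string, and a release without assets simply contributes zero.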

Here is an even shorter script by Brad Chapman (chapmanb), using sigmavirus24/github3.py (a Python library for interfacing with the GitHub API v3):

#!/usr/bin/env python
"""Get download stats for releases from GitHub.
Needs development version of github3.
pip install github3
pip install git+https://github.com/sigmavirus24/github3.py.git
"""
import github3

repo = github3.repository("chapmanb", "bcbio.variation")
for release in repo.iter_releases():
    for asset in release.iter_assets():
        print release.name, asset.name, asset.download_count

(There are many more examples.)

The old GitHub download counts have been deprecated and no longer seem to work. You can get download counts from releases, but it does take some manipulation:

library(jsonlite)
library(httr)

url <- "https://api.github.com/repos/allenluce/mmap-object/releases"
response <- GET(url)
json <- content(response, "text")
json <- fromJSON(json)
print(Reduce("+", lapply(json$assets, function(x) sum(x$download_count))))

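There are some caveats:

  1. The repo must have releases.
  2. The releases must have files.
  3. There is no API for getting the number of people who cloned your repo.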

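GitHub lets you count how many times released files have been downloaded, and that's about it. The google-services repo you used as an example has neither releases nor files!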
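
For comparison with the Python answers in this thread, here is a minimal sketch of the same total in Python, using only the standard library. The helper names fetch_releases and total_downloads are made up for illustration. It assumes unauthenticated access is enough and reads only the first page of results, so a repo with many releases would need pagination on top of this.

```python
import json
import urllib.request


def fetch_releases(owner, repo):
    """Fetch the first page of releases for a public repo (performs a network call)."""
    url = "https://api.github.com/repos/{}/{}/releases".format(owner, repo)
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def total_downloads(releases):
    """Sum download_count over every asset of every release.

    Returns 0 when there are no releases, or when no release has files,
    which covers the first two caveats above.
    """
    return sum(
        asset["download_count"]
        for release in releases
        for asset in release.get("assets", [])
    )


# Example usage (needs network access, so it is left commented out):
# print(total_downloads(fetch_releases("allenluce", "mmap-object")))
```

A repo like google-services, with no releases, would simply yield 0 here instead of an error.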