PyGithub: making maximum use of the API rate limit

Question

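I am trying to access GitHub's v3 API using the PyGithub library. The library is simple to use, but I find its documentation very vague.

Below, I successfully fetch the contents of a file given its path and its sha. My end goal is to reduce my API calls from 3 to just 1, because I want to make use of the full 5000 API calls allowed per hour.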

from github import Github
gh = Github(access_token) # I supply an access token here.
user = gh.get_user(owner_name) # This takes 1 API call
repo = user.get_repo(repo_name) # This takes 1 API call


file = repo.get_file_contents(filename, ref=sha) # This takes 1 API call

Does anyone know how I can pass the repo and owner name to get_file_contents(), or a similar function I could use to achieve this?

Any help is appreciated.

Answer 1

Does anyone know how I can pass the repo and owner name to get_file_contents()

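Given the current implementation of get_file_contents, it expects either:

a GithubObject (which costs API calls)
or a string (which costs no API calls)

Both, however, depend on the Repository class, and obtaining a Repository does involve API calls.
So it would be best to make your process long-lived, so that it can reuse the same Repository object within a single execution session.

That will not help, though, if you have to fetch files from many different repositories.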
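As a rough illustration of reusing Repository objects within one session, here is a minimal sketch. RepoCache is a hypothetical helper, not part of PyGithub. It assumes you pass it a callable that maps an 'owner/repo' string to a repository object, such as the get_repo method of a Github instance.

```python
class RepoCache:
    """Memoize repository objects so each repo is looked up once per process."""

    def __init__(self, get_repo):
        # get_repo: any callable mapping "owner/repo" to a repository object,
        # for example the bound method gh.get_repo (assumed, see lead-in).
        self._get_repo = get_repo
        self._repos = {}

    def get(self, full_name):
        # Only the first lookup for a given name reaches the API.
        if full_name not in self._repos:
            self._repos[full_name] = self._get_repo(full_name)
        return self._repos[full_name]
```

With PyGithub this might be used as repos = RepoCache(gh.get_repo), followed by repos.get(owner_name + '/' + repo_name).get_file_contents(filename, ref=sha). Each file fetch still costs one call, but the repository lookup is paid only once per repository.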

Answer 2

You can reduce it from 3 API calls to just 2 by calling get_repo() with a name of the form 'owner_name/repo_name':

from github import Github
gh = Github(access_token) # I supply an access token here.
repo = gh.get_repo(owner_name+'/'+repo_name) # This takes 1 API call

file = repo.get_file_contents(filename, ref=sha) # This takes 1 API call

I mention this here for future reference. In practice, I ended up using the requests library and building my own API calls.

Like this:

import requests
# Import python's base64 decoder
from base64 import b64decode as decode

def GET(owner_repo,filename,sha,access_token):
    # Supply Headers
    headers = {'content-type': 'application/json', 'Authorization':'token '+access_token}
    # This function is stable so long as the github API structure does not change. 
    # Also I am using the previously mentioned combo of owner/repo.
    url = 'https://api.github.com/repos/%s/contents/%s?ref=%s' % (owner_repo, filename, sha)
    # Let's stay within the API rate limits
    url_rate_limit = 'https://api.github.com/rate_limit'
    r = requests.get(url_rate_limit, headers=headers)
    rate_limit = int(r.json()['resources']['core']['remaining'])
    if(rate_limit > 0):
        # The actual request
        r = requests.get(url, headers=headers)
        # Note: you will need to handle any other status codes yourself.
        # I only check for '200' OK; in every other case this returns None.
        if(r.status_code == 200):
            content = r.json()['content']
            # Decode base64 content
            decoded_content = decode(content)
            return decoded_content

I license the above code under the MIT license.
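The function above asks /rate_limit before every fetch. GitHub also reports the same quota in the X-RateLimit-Remaining and X-RateLimit-Reset headers of each API response, so the extra request could be skipped. A minimal sketch, assuming the headers are passed in as a mapping with exactly those key names (a requests response's headers attribute is case-insensitive, a plain dict is not); seconds_until_reset is a hypothetical helper name:

```python
import time

def seconds_until_reset(headers, now=None):
    """Return 0 if calls remain, else the seconds left until the quota resets."""
    # If the header is missing, assume calls remain (an assumption of this sketch).
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0
    # X-RateLimit-Reset is a UTC epoch timestamp in seconds.
    reset_at = int(headers.get("X-RateLimit-Reset", 0))
    current = time.time() if now is None else now
    return max(0, reset_at - current)
```

One way to use it is to call time.sleep(seconds_until_reset(r.headers)) after each response before issuing the next request.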

Answer 3

The GitHub API supports conditional requests, and cache hits do not count against the rate limit:

Making a conditional request and receiving a 304 response does not count against your Rate Limit, so we encourage you to use it whenever possible.
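To make the quoted behaviour concrete, here is a minimal sketch of a conditional request done by hand. ConditionalFetcher is a hypothetical class. It assumes a requests-style session object whose get(url, headers=...) returns a response with status_code, headers, json() and raise_for_status(), and it keeps an unbounded in-memory cache purely for illustration.

```python
class ConditionalFetcher:
    """Remember ETags and replay them as If-None-Match on later requests."""

    def __init__(self, session):
        # session: e.g. a requests.Session (assumed, see lead-in)
        self._session = session
        self._cache = {}  # url -> (etag, decoded JSON body)

    def get(self, url, headers=None):
        headers = dict(headers or {})
        cached = self._cache.get(url)
        if cached:
            headers["If-None-Match"] = cached[0]
        response = self._session.get(url, headers=headers)
        if response.status_code == 304 and cached:
            # Unchanged on the server: reuse the stored body.
            return cached[1]
        response.raise_for_status()
        body = response.json()
        etag = response.headers.get("ETag")
        if etag:
            self._cache[url] = (etag, body)
        return body
```

The first call for a URL is an ordinary request. Later calls send the stored ETag, and a 304 reply is served from the local copy.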

However, PyGithub does not implement caching:

https://github.com/PyGithub/PyGithub/issues/585

It is possible in github3.py, however:

https://github.com/sigmavirus24/github3.py/issues/75#issuecomment-128345063

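There are some packages that add caching to requests:

requests-cache, which has a global patching mechanism, but does not yet support HTTP preconditions
cachecontrol, which has no global patching mechanism, but which I managed to integrate with PyGithub by patching some internals: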

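import github
from cachecontrol import CacheControl
from cachecontrol.caches import FileCache

gh = github.Github(token)
# Note: this patches name-mangled private attributes and may break across PyGithub versions.
class CachingConnectionClass(gh._Github__requester._Requester__connectionClass):
    def __init__(self, *args, **kwargs):
        # Initialize the parent connection first so that self.session exists.
        super(CachingConnectionClass, self).__init__(*args, **kwargs)
        self.session = CacheControl(self.session,
                                    cache=FileCache('.github-cache'))
gh._Github__requester._Requester__connectionClass = CachingConnectionClass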
