How to determine which forks on GitHub are ahead?
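Sometimes the original GitHub repository of a piece of software I'm using (for example linkchecker) sees little or no development, while a lot of forks have been created (in this case: 142, at the time of writing).
For each fork, I'd like to know:
- which of its branches have commits ahead of the original master branch
and for each such branch:
- how many commits it is ahead of the original
- how many commits it is behind
This can be checked by hand in the web interface, but I don't want to do this manually for each fork, I just want a CSV file with the results for all forks. How can this be scripted? The GitHub API can list the forks, but I can't see how to compare forks with it. Cloning every fork in turn and doing the comparison locally seems a bit crude.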
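I had exactly the same itch and wrote a scraper that takes the information printed in the rendered HTML for each fork: https://github.com/hbbio/forkizard
It's definitely not perfect, just a temporary solution.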
active-forks doesn't quite do what I want, but it comes close and is very easy to use.
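Late to the party - I think this is the second time I've ended up on this SO post, so I'll share my JS-based solution (I ended up just fetching and searching the HTML pages).
You can either create a bookmarklet from it or simply paste the whole thing into the console. Works in Chromium and Firefox:
EDIT: if there are more than 10 or so forks on the page, you may get locked out for scraping too fast (429 Too Many Requests in the network tab). Use async/await instead: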
javascript:(async () => {
  /* while on the forks page, collect all the hrefs and pop off the first one (original repo) */
  const forks = [...document.querySelectorAll('div.repo a:last-of-type')].map(x => x.href).slice(1);
  for (const fork of forks) {
    /* fetch the forked repo as html, search for the "This branch is [n commits ahead,] [m commits behind]", print it to console */
    await fetch(fork)
      .then(x => x.text())
      .then(html => console.log(`${fork}: ${html.match(/This branch is.*/).pop().replace('This branch is ', '')}`))
      .catch(console.error);
  }
})();
Or you can do it in batches, but it's fairly easy to get locked out:
javascript:(async () => {
  /* while on the forks page, collect all the hrefs and pop off the first one (original repo) */
  const forks = [...document.querySelectorAll('div.repo a:last-of-type')].map(x => x.href).slice(1);
  /* declared with const so it does not leak into the page as an implicit global */
  const getfork = (fork) => {
    return fetch(fork)
      .then(x => x.text())
      .then(html => console.log(`${fork}: ${html.match(/This branch is.*/).pop().replace('This branch is ', '')}`))
      .catch(console.error);
  };
  /* take the forks two at a time and wait for each pair before starting the next */
  while (forks.length) {
    await Promise.all(forks.splice(0, 2).map(getfork));
  }
})();
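If the same scraping is moved out of the browser, the batching above could be paired with a back-off whenever a 429 comes back. The following is only a rough Python sketch of that idea, not part of the bookmarklet: the function names, the retry count and the delays are assumptions chosen for illustration, and it uses nothing but the standard library.

```python
import time
import urllib.error
import urllib.request


def fetch_text(url):
    """Download a page and return its body as text (stdlib only, no auth)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def fetch_with_backoff(url, fetch=fetch_text, sleep=time.sleep,
                       max_retries=4, base_delay=2.0):
    """Fetch url, waiting longer after each HTTP 429 before retrying.

    fetch and sleep are parameters so the retry logic can be tried offline.
    The retry count and delays are arbitrary illustrative values.
    """
    for attempt in range(max_retries + 1):
        try:
            return fetch(url)
        except urllib.error.HTTPError as err:
            # Anything other than 429, or running out of retries, is re-raised.
            if err.code != 429 or attempt == max_retries:
                raise
            # Use the server's Retry-After hint when it is a plain number of seconds,
            # otherwise double the wait on every attempt.
            hint = err.headers.get("Retry-After") if err.headers is not None else None
            delay = float(hint) if hint and hint.isdigit() else base_delay * 2 ** attempt
            sleep(delay)
```

Calling `fetch_with_backoff(fork_url)` in a plain `for` loop over the fork URLs then gives the sequential behaviour of the async/await version, with pauses only when GitHub pushes back.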
Original (this fires all requests at once and may lock you out if the requests/s exceed what GitHub allows):
javascript:(() => {
  /* while on the forks page, collect all the hrefs and pop off the first one (original repo) */
  const forks = [...document.querySelectorAll('div.repo a:last-of-type')].map(x => x.href).slice(1);
  for (const fork of forks) {
    /* fetch the forked repo as html, search for the "This branch is [n commits ahead,] [m commits behind]", print it to console */
    fetch(fork)
      .then(x => x.text())
      .then(html => console.log(`${fork}: ${html.match(/This branch is.*/).pop().replace('This branch is ', '')}`))
      .catch(console.error);
  }
})();
This will print something like the following to the console:
https://github.com/user1/repo: 289 commits behind original:master.
https://github.com/user2/repo: 489 commits behind original:master.
https://github.com/user2/repo: 1 commit ahead, 501 commits behind original:master.
...
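Since the question asks for a CSV file, console lines in the format shown above could be post-processed into one. This is a minimal Python sketch under that assumption; the function names and the column headers are made up for illustration, and a line that does not look like `<fork url>: <status>` is simply skipped.

```python
import csv
import io
import re

# One console line looks like "<fork url>: <status text>".
LINE_RE = re.compile(r"^(?P<url>https://github\.com/\S+): (?P<status>.*)$")
AHEAD_RE = re.compile(r"(\d+) commits? ahead")
BEHIND_RE = re.compile(r"(\d+) commits? behind")


def parse_status_line(line):
    """Return (fork_url, ahead, behind) for one line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if not match:
        return None
    status = match.group("status")
    ahead = AHEAD_RE.search(status)
    behind = BEHIND_RE.search(status)
    # A missing "ahead" or "behind" part is counted as 0.
    return (match.group("url"),
            int(ahead.group(1)) if ahead else 0,
            int(behind.group(1)) if behind else 0)


def lines_to_csv(lines):
    """Build CSV text with a header row from an iterable of console lines."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["fork", "ahead", "behind"])
    for line in lines:
        row = parse_status_line(line)
        if row:
            writer.writerow(row)
    return out.getvalue()
```

Copying the console output into a text file and passing its lines to `lines_to_csv` would then give one row per fork.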
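EDIT: replaced the comments with block comments so the code can be pasted as one line.
After clicking "Insights" at the top and then "Forks" on the left, the following bookmarklet prints the ahead/behind information directly onto the web page, next to each fork.
The code to add as a bookmarklet (or to paste into the console):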
javascript:(async () => {
  /* while on the forks page, collect all the fork links and pop off the first one (original repo) */
  const aTags = [...document.querySelectorAll('div.repo a:last-of-type')].slice(1);
  for (const aTag of aTags) {
    /* fetch the forked repo as html, search for the "This branch is [n commits ahead,] [m commits behind]", print it directly onto the web page */
    /* $1 re-inserts the matched text inside the coloured <font> tag */
    await fetch(aTag.href)
      .then(x => x.text())
      .then(html => aTag.outerHTML += `${html.match(/This branch is.*/).pop().replace('This branch is', '').replace(/([0-9]+ commits? ahead)/, '<font color="#0c0">$1</font>').replace(/([0-9]+ commits? behind)/, '<font color="red">$1</font>')}`)
      .catch(console.error);
  }
})();
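You can also paste the code into the address bar, but note that some browsers delete the leading javascript: while pasting, so you will have to type javascript: yourself. Or copy everything except the leading j, type j, and then paste the rest.
Modified from the bookmarklet in the earlier answer.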
Bonus
The following bookmarklet also prints links to the ZIP files:
The code to add as a bookmarklet (or to paste into the console):
javascript:(async () => {
  /* while on the forks page, collect all the fork links and pop off the first one (original repo) */
  const aTags = [...document.querySelectorAll('div.repo a:last-of-type')].slice(1);
  for (const aTag of aTags) {
    /* fetch the forked repo as html, search for the "This branch is [n commits ahead,] [m commits behind]", print it directly onto the web page */
    /* $1 re-inserts the matched text inside the coloured <font> tag; the ZIP href found in the page is appended as a link */
    await fetch(aTag.href)
      .then(x => x.text())
      .then(html => aTag.outerHTML += `${html.match(/This branch is.*/).pop().replace('This branch is', '').replace(/([0-9]+ commits? ahead)/, '<font color="#0c0">$1</font>').replace(/([0-9]+ commits? behind)/, '<font color="red">$1</font>')}` + " <a " + `${html.match(/href="[^"]*\.zip">/).pop() + "Download ZIP</a>"}`)
      .catch(console.error);
  }
})();
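Here is a Python script for listing and cloning all forks that are ahead.
It doesn't use the API, so it is not subject to the API rate limit and needs no authentication. But it may need adjusting if the GitHub website design changes.
Unlike the bookmarklet in the other answer, which shows links to ZIP files, this script also saves information about the commits, because it uses git clone and creates a commits.htm file containing an overview.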
import requests, re, os, sys, time

def content_from_url(url):
    # TODO handle internet being off and stuff
    # .text gives a decoded str, so the str regexes and text-mode files below work on Python 3
    text = requests.get(url).text
    return text

def clone_ahead_forks(forklist_url):
    forklist_htm = content_from_url(forklist_url)
    with open("forklist.htm", "w", encoding="utf-8") as text_file:
        text_file.write(forklist_htm)

    is_root = True
    # not working if there are no forks: '<a class="(Link--secondary)?" href="(/([^/"]*)/[^/"]*)">'
    for match in re.finditer('<a (class=""|data-pjax="#js-repo-pjax-container") href="(/([^/"]*)/[^/"]*)">', forklist_htm):
        fork_url = 'https://github.com'+match.group(2)
        fork_owner_login = match.group(3)
        fork_htm = content_from_url(fork_url)
        match2 = re.search('<div class="d-flex flex-auto">[^<]*?([0-9]+ commits? ahead(, [0-9]+ commits? behind)?)', fork_htm)
        # TODO if website design changes, fallback onto checking whether 'ahead'/'behind'/'even with' appear only once on the entire page - in that case they are not part of the username etc.

        sys.stdout.write('.')
        if match2 or is_root:
            if match2:
                aheadness = match2.group(1) # for example '1 commit ahead, 2 commits behind'
            else:
                aheadness = 'root repo'
                is_root = False # for subsequent iterations

            dir_name = fork_owner_login+' ('+aheadness+')'
            print(dir_name)
            os.mkdir(dir_name)
            os.chdir(dir_name)

            # save commits.htm
            commits_htm = content_from_url(fork_url+'/commits')
            with open("commits.htm", "w", encoding="utf-8") as text_file:
                text_file.write(commits_htm)

            # git clone
            os.system('git clone '+fork_url+'.git')
            print()

            # no need to recurse into forks of forks because they are all listed on the initial page and being traversed already

            os.chdir('..')

base_path = os.getcwd()
# matches a leading Windows drive such as C:\ (the backslash has to be escaped in the pattern)
match_disk_letter = re.search(r'^([a-zA-Z]:\\)', base_path)

with open('repo_urls.txt') as url_file:
    for url in url_file:
        url = url.strip()
        match = re.search(r'github\.com/([^/]*)/([^/]*)$', url)
        if match:
            user_name = match.group(1)
            repo_name = match.group(2)
            print(repo_name)
            dirname_for_forks = repo_name+' ('+user_name+')'
            if not os.path.exists(dirname_for_forks):
                url += "/network/members" # page that lists the forks
                TMP_DIR = 'tmp_'+time.strftime("%Y%m%d-%H%M%S")
                if match_disk_letter: # if Windows, i.e. if path starts with A:\ or so, run git in A:\tmp_... instead of .\tmp_..., in order to prevent "filename too long" errors
                    TMP_DIR = match_disk_letter.group(1)+TMP_DIR
                print(TMP_DIR)
                os.mkdir(TMP_DIR)
                os.chdir(TMP_DIR)
                clone_ahead_forks(url)
                print()
                os.chdir(base_path)
                os.rename(TMP_DIR, dirname_for_forks)
            else:
                print(dirname_for_forks+' already exists, skipping.')
If you create the file repo_urls.txt with the following content (you can put several URLs, one URL per line):
https://github.com/cifkao/tonnetz-viz
then you will get the following directories, each of which contains the respective cloned repo:
tonnetz-viz (cifkao)
  bakaiadam (2 commits ahead)
  chumo (2 commits ahead, 4 commits behind)
  cifkao (root repo)
  codedot (76 commits ahead, 27 commits behind)
  k-hatano (41 commits ahead)
  shimafuri (11 commits ahead, 8 commits behind)
If it doesn't work, try earlier versions.
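Here's a Python script for listing and cloning the forks that are ahead. This script partially uses the API, so it triggers the rate limit (you can raise the rate limit, though not indefinitely, by adding GitHub API authentication to the script; please edit or post if you do).
Initially I tried to use the API entirely, but that triggered the rate limit too fast, so now I use is_fork_ahead_HTML instead of is_fork_ahead_API. This might need adjusting if the GitHub website design changes.
Because of the rate limit, I prefer the other answer that I posted here.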
import requests, json, os, re

def obj_from_json_from_url(url):
    # TODO handle internet being off and stuff
    text = requests.get(url).content
    obj = json.loads(text)
    return obj, text

def is_fork_ahead_API(fork, default_branch_of_parent):
    """ Use the GitHub API to check whether `fork` is ahead.
    This triggers the rate limit, so prefer the non-API version below instead.
    """
    # Compare default branch of original repo with default branch of fork.
    # Note: `user` and `repo` are the module-level names set at the bottom of the script.
    comparison, comparison_json = obj_from_json_from_url('https://api.github.com/repos/'+user+'/'+repo+'/compare/'+default_branch_of_parent+'...'+fork['owner']['login']+':'+fork['default_branch'])
    if comparison['ahead_by']>0:
        return comparison_json
    else:
        return False

def is_fork_ahead_HTML(fork):
    """ Use the GitHub website to check whether `fork` is ahead.
    """
    # .text gives a decoded str, which the str regex below needs on Python 3
    htm = requests.get(fork['html_url']).text
    match = re.search('<div class="d-flex flex-auto">[^<]*?([0-9]+ commits? ahead(, [0-9]+ commits? behind)?)', htm)
    # TODO if website design changes, fallback onto checking whether 'ahead'/'behind'/'even with' appear only once on the entire page - in that case they are not part of the username etc.
    if match:
        return match.group(1) # for example '1 commit ahead, 114 commits behind'
    else:
        return False

def clone_ahead_forks(user,repo):
    obj, _ = obj_from_json_from_url('https://api.github.com/repos/'+user+'/'+repo)
    default_branch_of_parent = obj["default_branch"]

    page = 0
    forks = None
    # stop once a page comes back empty (the original check was [{}]; an empty page is [])
    while forks not in ([], [{}]):
        page += 1
        forks, _ = obj_from_json_from_url('https://api.github.com/repos/'+user+'/'+repo+'/forks?per_page=100&page='+str(page))

        for fork in forks:
            aheadness = is_fork_ahead_HTML(fork)
            if aheadness:
                #dir_name = fork['owner']['login']+' ('+str(comparison['ahead_by'])+' commits ahead, '+str(comparison['behind_by'])+'commits behind)'
                dir_name = fork['owner']['login']+' ('+aheadness+')'
                print(dir_name)
                os.mkdir(dir_name)
                os.chdir(dir_name)
                os.system('git clone '+fork['clone_url'])
                print()

                # recurse into forks of forks
                if fork['forks_count']>0:
                    clone_ahead_forks(fork['owner']['login'], fork['name'])

                os.chdir('..')

user = 'cifkao'
repo = 'tonnetz-viz'
clone_ahead_forks(user,repo)
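The note above about adding API authentication could look roughly like this. It is a hedged sketch rather than a tested extension of the script: the helper names are invented here, the token is whatever personal access token you supply, and the headers argument is assumed to be a plain dict keyed exactly as `X-RateLimit-Remaining` and `X-RateLimit-Reset`, the response headers GitHub's REST API uses to report the quota.

```python
import time


def auth_headers(token=None):
    """Request headers for api.github.com; the token is optional."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = "token " + token
    return headers


def seconds_until_reset(response_headers, now=None):
    """Seconds to wait before the quota refills, or 0 while requests remain.

    X-RateLimit-Reset is a Unix timestamp in seconds. Missing headers are
    treated as "not rate limited".
    """
    remaining = int(response_headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0
    reset_at = int(response_headers.get("X-RateLimit-Reset", 0))
    now = time.time() if now is None else now
    return max(0, reset_at - now)
```

With requests, `requests.get(url, headers=auth_headers(token))` would send the token, and the script could sleep for `seconds_until_reset(response.headers)` before the next call instead of failing once the limit is hit.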
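Here's a Python script that uses the GitHub API. I wanted to include the date and the last commit message. You will need to include a personal access token (PAT) if you need the bump to 5k requests/hr.
Usage: python3 list-forks.py https://github.com/itinance/react-native-fs
Example output: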
https://github.com/itinance/react-native-fs root 2021-11-04 "Merge pull request #1016 from mjgallag/make-react-native-windows-peer-dependency-optional make react-native-windows peer dependency optional"
https://github.com/AnimoApps/react-native-fs diverged +2 -160 [+1m 10d] "Improved comments to align with new PNG support in copyAssetsFileIOS"
https://github.com/twinedo/react-native-fs ahead +1 [+26d] "clear warn yellow new NativeEventEmitter()"
https://github.com/synonymdev/react-native-fs ahead +2 [+23d] "Merge pull request #1 from synonymdev/event-emitter-fix Event Emitter Fix"
https://github.com/kongyes/react-native-fs ahead +2 [+10d] "aa"
https://github.com/kamiky/react-native-fs diverged +1 -2 [-6d] "add copyCurrentAssetsVideoIOS function to retrieve current modified videos"
https://github.com/nikola166/react-native-fs diverged +1 -2 [-7d] "version"
https://github.com/morph3ux/react-native-fs diverged +1 -4 [-30d] "Update package.json"
https://github.com/broganm/react-native-fs diverged +2 -4 [-1m 7d] "Update RNFSManager.m"
https://github.com/k1mmm/react-native-fs diverged +1 -4 [-1m 14d] "Invalidate upload session Prevent memory leaks"
https://github.com/TickKleiner/react-native-fs diverged +1 -4 [-1m 24d] "addListener and removeListeners methods wass added to pass warning"
https://github.com/nerdyfactory/react-native-fs diverged +1 -8 [-2m 14d] "fix: applying change from https://github.com/itinance/react-native-fs/pull/944"
import requests, os, sys, datetime
from dateutil.relativedelta import relativedelta
from urllib.parse import urlparse

# Read the token from the environment instead of hard-coding it in the script
# (the variable name GITHUB_PAT is simply this script's choice).
GITHUB_PAT = os.environ.get('GITHUB_PAT', '')

def json_from_url(url):
    # Send the token only when one is configured
    headers = { 'Authorization': 'token {}'.format(GITHUB_PAT) } if GITHUB_PAT else {}
    response = requests.get(url, headers=headers)
    return response.json()

def date_delta_to_text(date1, date2):
    ret = []
    date_delta = relativedelta(date2, date1)
    sign = '+' if date1 < date2 else '-'

    if date_delta.years != 0:
        ret.append('{}y'.format(abs(date_delta.years)))
    if date_delta.months != 0:
        ret.append('{}m'.format(abs(date_delta.months)))
    if date_delta.days != 0:
        ret.append('{}d'.format(abs(date_delta.days)))

    return '{}{}'.format(sign, ' '.join(ret))

def iso8601_date_to_date(date):
    return datetime.datetime.strptime(date, '%Y-%m-%dT%H:%M:%SZ')

def date_to_text(date):
    return date.strftime('%Y-%m-%d')

def process_repo(repo_author, repo_name, fork_of_fork):
    page = 1

    while 1:
        forks_url = 'https://api.github.com/repos/{}/{}/forks?per_page=100&page={}'.format(repo_author, repo_name, page)
        forks_json = json_from_url(forks_url)

        if not forks_json:
            break

        for fork_info in forks_json:
            fork_author = fork_info['owner']['login']
            fork_name = fork_info['name']
            forks_count = fork_info['forks_count']
            fork_url = 'https://github.com/{}/{}'.format(fork_author, fork_name)

            # The comparison runs in the parent repo, so use repo_name here (a fork may have been renamed).
            # Both sides are assumed to use a branch called master.
            compare_url = 'https://api.github.com/repos/{}/{}/compare/master...{}:master'.format(repo_author, repo_name, fork_author)
            compare_json = json_from_url(compare_url)

            if 'status' in compare_json:
                items = []

                status = compare_json['status']
                ahead_by = compare_json['ahead_by']
                behind_by = compare_json['behind_by']
                commits = compare_json['commits']

                if fork_of_fork:
                    items.append('   ')

                items.append(fork_url)
                items.append(status)

                if ahead_by != 0:
                    items.append('+{}'.format(ahead_by))

                if behind_by != 0:
                    items.append('-{}'.format(behind_by))

                if commits:
                    # The response may list fewer commits than total_commits, so take the last one returned
                    last_commit = commits[-1]
                    commit = last_commit['commit']
                    author = commit['author']
                    date = iso8601_date_to_date(author['date'])
                    items.append('[{}]'.format(date_delta_to_text(root_date, date)))
                    items.append('"{}"'.format(commit['message'].replace('\n', ' ')))

                if ahead_by > 0:
                    print(' '.join(items))

            if forks_count > 0:
                process_repo(fork_author, fork_name, True)

        page += 1

url_parsed = urlparse(sys.argv[1].strip())
path_array = url_parsed.path.split('/')
root_author = path_array[1]
root_name = path_array[2]

root_url = 'https://github.com/{}/{}'.format(root_author, root_name)
commits_url = 'https://api.github.com/repos/{}/{}/commits/master'.format(root_author, root_name)
commits_json = json_from_url(commits_url)
commit = commits_json['commit']
author = commit['author']
root_date = iso8601_date_to_date(author['date'])
print('{} root {} "{}"'.format(root_url, date_to_text(root_date), commit['message'].replace('\n', ' ')))

process_repo(root_author, root_name, False)
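The script above prints space-separated lines, whereas the question asked for a CSV file. One way to get there, sketched here with hypothetical helper names and column headers, is to build the compare URL from each repository's own default branch (the `default_branch` field the fork list already returns) and to write the `status`, `ahead_by` and `behind_by` fields of each compare response with the csv module. Branch names containing special characters would still need URL-encoding, which this sketch leaves out.

```python
import csv
import io


def compare_url(base_owner, base_repo, base_branch, fork_owner, fork_branch):
    """Build the compare endpoint URL: parent branch ... fork_owner:fork branch."""
    return "https://api.github.com/repos/{}/{}/compare/{}...{}:{}".format(
        base_owner, base_repo, base_branch, fork_owner, fork_branch)


def comparison_row(fork_full_name, comparison):
    """Pick the CSV columns out of one compare response (already parsed JSON)."""
    return [fork_full_name, comparison["status"],
            comparison["ahead_by"], comparison["behind_by"]]


def write_csv(rows, out):
    """Write a header plus one row per fork to a file-like object."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["fork", "status", "ahead_by", "behind_by"])
    writer.writerows(rows)


# Offline example with a hand-written comparison dict:
buffer = io.StringIO()
write_csv([comparison_row("twinedo/react-native-fs",
                          {"status": "ahead", "ahead_by": 1, "behind_by": 0})],
          buffer)
```

Inside `process_repo`, each `compare_json` that contains a `status` key could be turned into a row this way and the collected rows written once at the end.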
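useful-forks
useful-forks is an online tool which filters all the forks based on ahead criteria. I think it answers your needs quite well. :)
For the repo in your question, you could do: https://useful-forks.github.io/?repo=wummel/linkchecker
This should give you a filtered list of the forks that are ahead (I ran it on 2022-04-02).
Also available as a Chrome extension
If you would also like to use it as a Chrome extension, see the GitHub repo: https://github.com/useful-forks/useful-forks.github.io#chrome-extension-wip
Disclaimer
I am the maintainer of this project.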