How to update GitHub Actions CI to detect Trojan Code commits (malicious [bidirectional] unicode chars, python)
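How can I update my GitHub Actions CI pipeline so that, when any variant of the attacks demonstrated in the Trojan Source whitepaper is submitted as a PR to my GitHub repository, the PR either automatically rejects the commit or adds a comment to the PR warning about the vulnerability?
Background: On October 30, 2021, Nicholas Boucher and Ross Anderson published a paper titled Trojan Source: Invisible Vulnerabilities, which outlined several ways that unicode could be used maliciously in code submissions that appear (pixel-for-pixel) identical to non-malicious code, but are, in fact, malicious. Besides more-obvious "ambiguous characters" used to define & call distinct functions, they specifically describe how a clever attacker can utilize unicode bidirectional control characters to do some very nasty things.
More background: I manage an open-source python project hosted on GitHub. Although, after this paper was published, GitHub added warnings when viewing code that contains potentially malicious unicode characters, there is no way to visually detect these issues in a PR in the GitHub WUI when merging a PR.
My question is: how can I protect myself from malicious unicode commits that have not yet been discovered? And from other vulnerabilities that are nearly impossible to see?
What can I add to my GitHub Actions CI pipeline to warn me about invisible dangers in user-contributed python code?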
EDIT: Examples that should be caught include the following python snippets:
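You can add a workflow to your GitHub Actions pipeline that detects non-ascii characters and automatically comments a warning on the PR.
Add this to .github/workflows/unicode_warn.yml in the root of your repository: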
```yaml
################################################################################
# File:    .github/workflows/unicode_warn.yml
# Version: 0.1
# Purpose: Detects Unicode in PRs and comments the results of findings in PR
#          * https://tech.michaelaltfield.net/bidi-unicode-github-defense/
# Authors: Michael Altfield <michael@michaelaltfield.net>
# Created: 2021-11-20
# Updated: 2021-11-20
################################################################################
name: malicious_sanity_checks

# execute this workflow automatically on all PRs
on: [pull_request]

jobs:

  unicode_warn:
    runs-on: ubuntu-latest
    container: debian:bullseye-slim

    steps:

      - name: Prereqs
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          apt-get update
          apt-get install -y git bsdmainutils
          git clone "https://token:${GITHUB_TOKEN}@github.com/${GITHUB_REPOSITORY}.git" .
        shell: bash

      - name: Check diff for unicode
        id: unicode_diff
        run: |
          set -x
          diff=`git diff --unified=0 ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} | grep -E "^[+]" | grep -Ev '^(--- a/|\+\+\+ b/)'`
          unicode_diff=`echo -n "${diff}" | grep -oP "[^\x00-\x7F]*"`
          unicode_grep_exit_code=$?
          echo "${unicode_diff}"
          unicode_diff_hexdump=`echo -n "${unicode_diff}" | hd`
          echo "${unicode_diff_hexdump}"
          # did we select any unicode characters?
          if [[ "${unicode_diff_hexdump}" == "" ]]; then
            # we didn't find any unicode characters
            human_result="INFO: No unicode characters found in PR's commits"
            echo "${human_result}"
          else
            # we found at least 1 unicode character
            human_result="^^ WARNING: Unicode characters found in diff!"
            echo "${human_result}"
            echo "${diff}"
          fi
          echo "UNICODE_HUMAN_RESULT=${human_result}" >> $GITHUB_ENV
        shell: bash {0}

      # leave a comment on the PR. See also
      #  *
      # make sure this doesn't open command injection risks
      #  * https://github.com/victoriadrake/github-guestbook/issues/1#issuecomment-657121754
      - name: Leave comment on PR
        uses: actions/github-script@v5
        with:
          github-token: ${{secrets.GITHUB_TOKEN}}
          script: |
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: "${{ env.UNICODE_HUMAN_RESULT }}"
            })
```
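The above file defines a workflow named malicious_sanity_checks, which contains a job named unicode_warn. This job contains several steps, which are executed every time a new PR is created in your repository:
- Prereqs - first, installs basic dependencies such as git and hd
- Check diff for unicode - a simple BASH script uses grep to detect non-ascii characters in the diff of the PR's to-be-merged commits
- Leave comment on PR - adds a comment to the PR indicating whether or not the commits contain unicode characters
For more information on this issue and on what can protect you from Trojan Source vulnerabilities, see the source: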
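As an added example (not part of the original answer or its author's workflow), the Python check below parses the same `git diff` output and reports file, line and column for each non-ASCII character on added lines. It goes beyond the grep in the workflow by separating bidirectional controls and other invisible format characters from ordinary non-ASCII text; the names `scan_diff`, `classify`, `should_block` and the sample diff are assumptions made for illustration.

```python
import re
import unicodedata

# Bidi controls: embeddings/overrides, isolates, and the implicit marks.
BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069\u200e\u200f\u061c")
HUNK_RE = re.compile(r"^@@ -\d+(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

def classify(ch):
    if ch in BIDI_CONTROLS:
        return "bidi-control"
    if unicodedata.category(ch) == "Cf":
        return "invisible-format"  # zero-width space/joiner, BOM, ...
    return "non-ascii"

def scan_diff(diff_text):
    """Return (path, line, col, label, codepoint, name) per non-ASCII char on added lines."""
    findings, path = [], None
    old_left = new_left = line_no = 0
    for raw in diff_text.split("\n"):  # not splitlines(): it also splits on U+2028/U+2029
        if old_left <= 0 and new_left <= 0:  # outside a hunk: only headers matter
            m = HUNK_RE.match(raw)
            if m:
                old_left, line_no = int(m.group(1) or 1), int(m.group(2))
                new_left = int(m.group(3) or 1)
            elif raw.startswith("+++ "):
                path = raw[6:] if raw.startswith("+++ b/") else raw[4:]
            continue
        if raw.startswith(("-", "\\")):  # removed line or "\ No newline at end of file"
            old_left -= raw.startswith("-")
            continue
        if raw.startswith("+"):
            for col, ch in enumerate(raw[1:], start=1):
                if ord(ch) > 0x7F:
                    findings.append((path, line_no, col, classify(ch),
                                     f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED")))
        else:
            old_left -= 1  # context line counts on both sides of the hunk
        new_left -= 1
        line_no += 1
    return findings

def should_block(findings):
    """Fail the check only for characters that are invisible or reorder displayed text."""
    return any(label != "non-ascii" for _, _, _, label, _, _ in findings)

# Constructed sample diff (not from the page); escapes keep this file pure ASCII.
SAMPLE_DIFF = ("+++ b/app/auth.py\n@@ -10,2 +10,4 @@\n     level = user.level\n"
               "+    # caf\u00e9 menu access\n"
               "+    label = 'user\u202e\u2066 text\u2069'\n     return False\n")
```

In CI, the text passed to `scan_diff` would be the output of the workflow's `git diff` between the PR base and head SHAs, with the job exiting non-zero when `should_block` returns true.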