Using an SSH key on Dataflow workers to pull a private library
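I am setting up a Dataflow job whose workers need access to a private Bitbucket repository in order to install a library used to process the data. To grant the workers access, I created an SSH key pair (public and private) and managed to get the private key onto the Dataflow workers. When I try to pip install the package via git+ssh, I get the error Host key verification failed.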
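I tried to find the .ssh/known_hosts file on the Dataflow worker, but that is not as straightforward as on a regular VM.
Alternatively, I tried to set it up myself with the following commands, without success: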
mkdir -p ~/.ssh
chmod 0700 ~/.ssh
ssh-keyscan bitbucket.org > ~/.ssh/known_hosts
I still get the Host key verification failed error.
Another suggested fix for this problem is to run ssh-keygen -R bitbucket.org, but that produces the following error:
mkstemp: No such file or directory
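With the Dataflow Python SDK you need to package your code with setup.py. All the commands executed when a worker starts up are run through subprocess.Popen. The command list is as follows: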
CUSTOM_COMMANDS = [
    # decrypt the encrypted key in the repository via gcloud kms
    ['gcloud', '-v'],
    ['gcloud', 'kms', 'decrypt', '--location', 'global', '--keyring',
     'bitbucketpackages', '--key', 'package', '--plaintext-file',
     'bb_package_key_decrypted', '--ciphertext-file', 'bb_package_key'],
    ['chmod', '700', 'bb_package_key_decrypted'],
    # install git & ssh
    ['apt-get', 'update'],
    ['apt-get', 'install', '-y', 'openssh-server'],
    ['apt-get', 'install', '-y', 'git'],
    # add bitbucket.org as known host
    ['mkdir', '-p', '~/.ssh'],
    ['chmod', '0700', '~/.ssh'],
    ['ssh-keyscan', 'bitbucket.org', '>', '~/.ssh/known_hosts'],
    # other attempts to fix it
    # ['ssh-keygen', '-R', 'bitbucket.org']
    # pip install
    ['sh', '-c', 'GIT_SSH_COMMAND="ssh -i ./bb_package_key_decrypted" pip install git+ssh://git@bitbucket.org/team/repo.git'],
]
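One detail worth checking in a list like this: when each entry is handed to subprocess.Popen as an argument list, no shell is involved, so '>' and '~' reach the program as literal text instead of being treated as a redirection or the home directory. The sketch below illustrates that behaviour under the assumption that the runner calls Popen without shell=True. The name run_command and the stand-in child program are hypothetical and not taken from the original setup.py.

```python
import subprocess
import sys


def run_command(command_list):
    """Run one argv-style command without a shell and return its output as text."""
    process = subprocess.Popen(
        command_list,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    stdout_data, _ = process.communicate()
    if process.returncode != 0:
        raise RuntimeError(
            "Command %s failed: exit code %s" % (command_list, process.returncode)
        )
    return stdout_data.decode()


# Stand-in for ssh-keyscan: a child process that only prints the arguments it received.
show_args = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]

# '>' and '~/.ssh/known_hosts' arrive as ordinary arguments; nothing is redirected.
literal_output = run_command(show_args + ["bitbucket.org", ">", "~/.ssh/known_hosts"])
print(literal_output)  # prints ['bitbucket.org', '>', '~/.ssh/known_hosts']
```

If the same happens on the worker, ssh-keyscan would treat '>' and the path as extra host names and no known_hosts file would be written, which would be consistent with the Host key verification failed error. Wrapping such a step in ['sh', '-c', '...'], as the pip install entry already does, lets a shell handle the redirection.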
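Try updating the ssh-keyscan step to write to some temporary path, and then pass the known-hosts file location as part of GIT_SSH_COMMAND. For example, I would update your script to: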
CUSTOM_COMMANDS = [
    # decrypt the encrypted key in the repository via gcloud kms
    ['gcloud', '-v'],
    ['gcloud', 'kms', 'decrypt', '--location', 'global', '--keyring',
     'bitbucketpackages', '--key', 'package', '--plaintext-file',
     'bb_package_key_decrypted', '--ciphertext-file', 'bb_package_key'],
    ['chmod', '700', 'bb_package_key_decrypted'],
    # install git & ssh
    ['apt-get', 'update'],
    ['apt-get', 'install', '-y', 'openssh-server'],
    ['apt-get', 'install', '-y', 'git'],
    # add bitbucket.org as known host; these steps go through sh -c because
    # Popen without a shell neither expands '~' nor interprets '>'
    ['sh', '-c', 'mkdir -p ~/.ssh && chmod 0700 ~/.ssh'],
    ['sh', '-c', 'ssh-keyscan bitbucket.org > /tmp/bit_bucket_known_hosts'],
    # other attempts to fix it
    # ['ssh-keygen', '-R', 'bitbucket.org']
    # pip install
    ['sh', '-c', 'GIT_SSH_COMMAND="ssh -o UserKnownHostsFile=/tmp/bit_bucket_known_hosts -i ./bb_package_key_decrypted" pip install git+ssh://git@bitbucket.org/team/repo.git'],
]
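As an alternative to embedding everything in one shell string, the same idea can be sketched in plain Python: write the ssh-keyscan output to a file through the stdout argument, and pass GIT_SSH_COMMAND to pip through the env argument. This is only a rough sketch under stated assumptions. The helper names write_known_hosts, build_pip_env and install_private_package are hypothetical, and the paths simply reuse the ones from the script above.

```python
import os
import shlex
import subprocess


def write_known_hosts(known_hosts_path, scan_command):
    """Run scan_command without a shell and write its stdout to known_hosts_path."""
    with open(known_hosts_path, "wb") as known_hosts_file:
        subprocess.check_call(scan_command, stdout=known_hosts_file)


def build_pip_env(key_path, known_hosts_path, base_env=None):
    """Return a copy of the environment with GIT_SSH_COMMAND set for git+ssh installs."""
    env = dict(os.environ if base_env is None else base_env)
    # Git runs GIT_SSH_COMMAND through a shell, so the paths are shell-quoted.
    env["GIT_SSH_COMMAND"] = "ssh -o UserKnownHostsFile=%s -i %s" % (
        shlex.quote(known_hosts_path),
        shlex.quote(key_path),
    )
    return env


def install_private_package(repo_url, key_path, known_hosts_path):
    """Record the host key, then pip install the package over git+ssh."""
    write_known_hosts(known_hosts_path, ["ssh-keyscan", "bitbucket.org"])
    subprocess.check_call(
        ["pip", "install", repo_url],
        env=build_pip_env(key_path, known_hosts_path),
    )
```

On a worker this might be called as install_private_package('git+ssh://git@bitbucket.org/team/repo.git', './bb_package_key_decrypted', '/tmp/bit_bucket_known_hosts'). Keeping the environment construction in its own function makes it possible to check the resulting GIT_SSH_COMMAND string without network access.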