Can't SSH into Google Cloud Engine. Boot errors

Question

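I can't SSH into my instance. I have tried the following:

Created a new SSH key pair and added it to the project, but that didn't help. I created a brand-new instance in the same project and can SSH into it easily, so I don't think the SSH keys are the problem.
“Block project-wide SSH keys” is also unchecked.
Created a machine image and spawned a new instance from it; it has the same problem.
Enabled the serial console with a startup script, but that didn't help either. It simply won't accept the password.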

    #! /bin/bash
    adduser serial1
    echo serial1:desperate-attempt | chpasswd
    usermod -aG google-sudoers serial1

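I don't think this is a disk space issue. The instance has a 10 GB disk. I only write a single log file, and the last time I checked it was ~50 MB. I also don't see any disk-space errors in the console log.

I do see these errors in the “Serial port 1 (console)” log: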

    Oct 16 16:29:01 instance-1 ntpd[668]: bind(21) AF_INET6 fe80::4001:aff:fe8e:2%2#123 flags 0x11 failed: Cannot assign requested address
    Oct 16 16:29:01 instance-1 ntpd[668]: unable to create socket on eth0 (5) for fe80::4001:aff:fe8e:2%2#123
    Oct 16 16:29:01 instance-1 ntpd[668]: failed to init interface for address fe80::4001:aff:fe8e:2%2
    Oct 16 16:29:01 instance-1 ntpd[668]: Listening on routing socket on fd #21 for interface updates
    Oct 16 16:29:02 instance-1 ntpd[668]: bind(24) AF_INET6 fe80::4001:aff:fe8e:2%2#123 flags 0x11 failed: Cannot assign requested address
    Oct 16 16:29:02 instance-1 ntpd[668]: unable to create socket on eth0 (6) for fe80::4001:aff:fe8e:2%2#123
    Oct 16 16:29:02 instance-1 ntpd[668]: failed to init interface for address fe80::4001:aff:fe8e:2%2
    Oct 16 16:29:02 instance-1 google_instance_setup[663]: Traceback (most recent call last):
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:   File "/usr/bin/google_instance_setup", line 6, in <module>
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:     from pkg_resources import load_entry_point
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:   File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 3257, in <module>
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:     def _initialize_master_working_set():
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:   File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 3240, in _call_aside
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:     f(*args, **kwargs)
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:   File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 3269, in _initialize_master_working_set
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:     working_set = WorkingSet._build_master()
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:   File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 582, in _build_master
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:     ws.require(__requires__)
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:   File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 899, in require
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:     needed = self.resolve(parse_requirements(requirements))
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:   File "/usr/local/lib/python3.9/site-packages/pkg_resources/__init__.py", line 785, in resolve
    Oct 16 16:29:02 instance-1 google_instance_setup[663]:     raise DistributionNotFound(req, requirers)
    Oct 16 16:29:02 instance-1 google_instance_setup[663]: pkg_resources.DistributionNotFound: The 'google-compute-engine==2.8.13' distribution was not found and is required by the application
    [FAILED] Failed to start Google Compute Engine Instance Setup.
    See 'systemctl status google-instance-setup.service' for details.
             Starting NSS cache refresh...
    Oct 16 16:29:02 instance-1 systemd[1]: google-instance-setup.service: Main process exited, code=exited, status=1/FAILURE
    Oct 16 16:29:02 instance-1 systemd[1]: Failed to start Google Compute Engine Instance Setup.
    Oct 16 16:29:02 instance-1 systemd[1]: google-instance-setup.service: Unit entered failed state.
    Oct 16 16:29:02 instance-1 systemd[1]: google-instance-setup.service: Failed with result 'exit-code'.

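The same errors repeat for google_accounts_daemon, google_metadata_script_runner, google_network_daemon, google_*, ...

It sounds like some packages are out of date. But how can I install them without logging in to the instance? Is there a good way to fix this error?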

Answer 1

On your instance, the Google Cloud packages, the Python installation, or both are broken. That is what prevents you from logging in.
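If you want to see how widespread the breakage is without logging in, here is a rough sketch that summarises the serial log. The function name summarize_missing_dist is made up for illustration, and the gcloud get-serial-port-output call in the usage comment is my addition, not something from the question. It assumes the log lines look like the ones quoted in the question.

```shell
# Count, per service, the lines reporting the missing distribution.
# Reads serial console text on stdin and prints "count service" pairs.
summarize_missing_dist() {
  grep -F 'pkg_resources.DistributionNotFound:' \
    | sed -E 's/^.* ([A-Za-z0-9_-]+)\[[0-9]+\]: .*$/\1/' \
    | sort | uniq -c | awk '{print $1, $2}'
}

# Usage (NAME and ZONE are placeholders for the broken instance):
#   gcloud compute instances get-serial-port-output NAME --zone=ZONE --port=1 \
#     | summarize_missing_dist
```

If every google_* service shows up in the output, the problem is the shared Python environment rather than a single daemon.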

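I recommend creating a new instance and moving the persistent disk from the broken instance to the new one.

Step 1:

Create a new instance in the same zone. A micro instance is fine.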
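If you prefer the command line to the console for this step, something like the following should work. The function name create_rescue_vm and the e2-micro machine type are my assumptions; any small machine type available in your zone will do.

```shell
# Create a small helper VM in the same zone as the broken instance.
# $1 = name for the new VM (hypothetical, e.g. rescue-vm), $2 = zone
create_rescue_vm() {
  local name="$1" zone="$2"
  # e2-micro is an assumption; pick whatever small type your zone offers.
  gcloud compute instances create "${name}" --zone="${zone}" --machine-type=e2-micro
}

# Usage:
#   create_rescue_vm rescue-vm ZONE
```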

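Step 2:

Open a Cloud Shell prompt (this also works on your desktop if gcloud is set up). Run the following command, replacing NAME with your instance name (the broken system), DISK with the boot disk name, and ZONE with the zone the system is in: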

    gcloud compute instances detach-disk NAME --disk=DISK --zone=ZONE

Make sure the previous command did not report any errors.
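One thing worth adding: as far as I know, a boot disk can only be detached while the instance is stopped, so you may need to stop the broken instance first. A minimal sketch that chains the two commands and only detaches if the stop succeeded (the function name detach_boot_disk is hypothetical):

```shell
# Stop the broken instance, then detach its boot disk.
# $1 = broken instance name, $2 = boot disk name, $3 = zone
detach_boot_disk() {
  local name="$1" disk="$2" zone="$3"
  # "&&" makes the detach run only when the stop reported no error.
  gcloud compute instances stop "${name}" --zone="${zone}" \
    && gcloud compute instances detach-disk "${name}" --disk="${disk}" --zone="${zone}"
}

# Usage:
#   detach_boot_disk NAME DISK ZONE && echo "disk detached"
```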

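Step 3:

Attach this disk to the new instance you created.

Make sure the new VM instance is running before you attach the second disk. Sometimes, if more than one disk is bootable, the instance can get confused about which disk to boot from.

Go to Compute Engine -> VM instances. Click your instance. Click Edit. Under “Additional disks”, click “Add item”. For the name, enter/select the disk you detached from the broken instance. Click Save.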
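The same step can probably be done from Cloud Shell instead of the console. This sketch checks that the new VM reports RUNNING before attaching, for the reason given above. The function name attach_rescued_disk and its argument order are assumptions for illustration.

```shell
# Attach the detached disk to the new VM, but only if that VM is running.
# $1 = new (rescue) VM name, $2 = disk name, $3 = zone
attach_rescued_disk() {
  local rescue_vm="$1" disk="$2" zone="$3"
  local status
  status="$(gcloud compute instances describe "${rescue_vm}" \
    --zone="${zone}" --format='value(status)')"
  if [ "${status}" != "RUNNING" ]; then
    echo "${rescue_vm} is ${status:-in an unknown state}, not RUNNING; start it first" >&2
    return 1
  fi
  gcloud compute instances attach-disk "${rescue_vm}" --disk="${disk}" --zone="${zone}"
}

# Usage:
#   attach_rescued_disk rescue-vm DISK ZONE
```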

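Step 4:

SSH into the new instance, which now has both disks attached.

Step 5:

Follow these steps carefully. Mount the second disk as a subdirectory of the root filesystem.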

Become superuser: run sudo -s
Run df -h and make sure /dev/sdb1 is not mounted.
Create a directory for the mount point: mkdir /mnt/oldsystem
Mount the second disk: mount /dev/sdb1 /mnt/oldsystem
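The steps above can be wrapped in a small function so the "is it already mounted" check is not skipped by accident. This is only a sketch: the name mount_old_system is made up, it checks /proc/mounts rather than reading df -h output, and /dev/sdb1 is just the default from the steps above, so confirm the device name (for example with lsblk) before running it as root.

```shell
# Mount the old boot partition under a mount point, refusing to continue
# if the partition is already mounted somewhere.
# $1 = partition (default /dev/sdb1), $2 = mount point (default /mnt/oldsystem)
mount_old_system() {
  local part="${1:-/dev/sdb1}"
  local target="${2:-/mnt/oldsystem}"
  # /proc/mounts lists one mounted filesystem per line, device name first.
  if grep -q "^${part} " /proc/mounts 2>/dev/null; then
    echo "${part} is already mounted" >&2
    return 1
  fi
  mkdir -p "${target}" && mount "${part}" "${target}"
}

# Usage (as root):
#   mount_old_system /dev/sdb1 /mnt/oldsystem
```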

You can now access the files of the old filesystem under the path /mnt/oldsystem.
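As a starting point for looking around, here is a hedged sketch that checks the two places named in the traceback from the question: the /usr/bin/google_instance_setup script and the /usr/local/lib/python3.9/site-packages directory. The function name inspect_old_system and the google_compute_engine* file-name pattern are my assumptions about how the package metadata is named on disk.

```shell
# Report which interpreter the failing script uses and whether any
# google-compute-engine metadata exists where the traceback looked.
# $1 = root of the mounted old filesystem (default /mnt/oldsystem)
inspect_old_system() {
  local root="${1:-/mnt/oldsystem}"
  local script="${root}/usr/bin/google_instance_setup"
  local site="${root}/usr/local/lib/python3.9/site-packages"
  local entry
  local found="no"

  # The first line (shebang) shows which Python the script expects.
  if [ -f "${script}" ]; then
    echo "interpreter: $(head -n 1 "${script}")"
  else
    echo "interpreter: ${script} not found"
  fi

  # Assumed naming: metadata directories start with the distribution name.
  for entry in "${site}"/google[-_]compute[-_]engine*; do
    if [ -e "${entry}" ]; then
      found="yes"
      echo "found: ${entry##*/}"
    fi
  done
  [ "${found}" = "yes" ] || echo "found: nothing matching google_compute_engine in ${site}"
}

# Usage:
#   inspect_old_system /mnt/oldsystem
```

If nothing is found there, that would be consistent with the DistributionNotFound error in the log, but it does not by itself tell you how the installation got into that state.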
