Scylla fails to mount RAID volume after restarting EC2 instance
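I'm new to Scylla. I followed the installation steps on the Scylla website to set up a small 4-node Scylla cluster in my AWS account. I'm using the Scylla AMI on my EC2 instances.
If I stop one of the EC2 instances and then start it again, I get the message Failed mounting RAID volume! when I try to restart Scylla.
I figured I have to set up the RAID volume again by running: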
scylla_raid_setup --raiddev /dev/md0 --disks /dev/nvme1n1,/dev/nvme2n1 --update-fstab --root /var/lib/scylla --volume-role all
However, when I try to start Scylla, I get the following error message:
A dependency job for scylla-server.service failed. See 'journalctl -xe' for details.
It looks like the mount failed. Here is the log:
-- Subject: Unit var-lib-scylla.mount has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit var-lib-scylla.mount has failed.
--
-- The result is dependency.
Dependency failed for Scylla Server.
-- Subject: Unit scylla-server.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit scylla-server.service has failed.
--
-- The result is dependency.
May 05 13:23:56 systemd[1]: Dependency failed for Scylla JMX.
-- Subject: Unit scylla-jmx.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit scylla-jmx.service has failed.
--
-- The result is dependency.
May 05 13:23:56 systemd[1]: Job scylla-jmx.service/start failed with result 'dependency'.
May 05 13:23:56 systemd[1]: Dependency failed for Run Scylla Housekeeping daily mode.
-- Subject: Unit scylla-housekeeping-daily.timer has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit scylla-housekeeping-daily.timer has failed.
--
-- The result is dependency.
May 05 13:23:56 polkitd[4226]: Unregistered Authentication Agent for unix-process:7668:53288 (system bus name :1.20, object path /org/freedesktop/PolicyKit1/AuthenticationAge
May 05 13:23:56 systemd[1]: Job scylla-housekeeping-daily.timer/start failed with result 'dependency'.
May 05 13:23:56 sudo[7666]: pam_unix(sudo:session): session closed for user root
May 05 13:23:56 systemd[1]: Dependency failed for Run Scylla Housekeeping restart mode.
-- Subject: Unit scylla-housekeeping-restart.timer has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit scylla-housekeeping-restart.timer has failed.
--
-- The result is dependency.
May 05 13:23:56 systemd[1]: Job scylla-housekeeping-restart.timer/start failed with result 'dependency'.
May 05 13:23:56 systemd[1]: Job scylla-server.service/start failed with result 'dependency'.
May 05 13:23:56 systemd[1]: Job var-lib-scylla.mount/start failed with result 'dependency'.
May 05 13:23:56 systemd[1]: Job dev-disk-by\x2duuid-67fde517\x2d892a\x2d4a3f\x2d9e19\x2dac71c9bdd533.device/start failed with result 'timeout'.
What should my next step be?
Here are the disks:
Disk /dev/nvme1n1: 7500.0 GB, 7500000000000 bytes, 14648437500 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/nvme2n1: 7500.0 GB, 7500000000000 bytes, 14648437500 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk /dev/nvme0n1: 10.7 GB, 10737418240 bytes, 20971520 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000b0301
If I include nvme0n1 in the disks passed to scylla_raid_setup, it returns: /dev/nvme0n1 is busy.
Otherwise, this is the scylla_raid_setup output:
Creating RAID0 for scylla using 2 disk(s): /dev/nvme2n1,/dev/nvme1n1
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
meta-data=/dev/md0 isize=512 agcount=32, agsize=114438912 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0, sparse=0
data = bsize=4096 blocks=3662043136, imaxpct=5
= sunit=256 swidth=512 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=521728, version=2
= sectsz=512 sunit=8 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
My /etc/fstab file looks like this:
UUID=0a84de8e-5bfe-43e7-992b-5bfff8cdce43 / xfs defaults 0 0
UUID="67fde517-892a-4a3f-9e19-ac71c9bdd533" /var/lib/scylla xfs noatime,nofail 0 0
UUID="24aab0fc-dc32-48de-bf6b-5a3d5bcd1f00" /var/lib/scylla xfs noatime,nofail 0 0
I removed one of the entries and tried restarting Scylla, but it still fails to start :(
After running systemctl start var-lib-scylla.mount:
May 06 14:18:18 ip-172-31-14-126.ec2.internal polkitd[4760]: Registered Authentication Agent for unix-process:7789:57998 (system bus name :1.34 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_GB.UTF-8)
May 06 14:19:48 ip-172-31-14-126.ec2.internal systemd[1]: Job dev-disk-by\x2duuid-17c356e1\x2d1ec9\x2d47d1\x2d8e98\x2d45182b7a9454.device/start timed out.
May 06 14:19:48 ip-172-31-14-126.ec2.internal systemd[1]: Timed out waiting for device dev-disk-by\x2duuid-17c356e1\x2d1ec9\x2d47d1\x2d8e98\x2d45182b7a9454.device.
-- Subject: Unit dev-disk-by\x2duuid-17c356e1\x2d1ec9\x2d47d1\x2d8e98\x2d45182b7a9454.device has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit dev-disk-by\x2duuid-17c356e1\x2d1ec9\x2d47d1\x2d8e98\x2d45182b7a9454.device has failed.
--
-- The result is timeout.
May 06 14:19:48 ip-172-31-14-126.ec2.internal systemd[1]: Dependency failed for /var/lib/scylla.
-- Subject: Unit var-lib-scylla.mount has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit var-lib-scylla.mount has failed.
--
-- The result is dependency.
May 06 14:19:48 systemd[1]: Job var-lib-scylla.mount/start failed with result 'dependency'.
May 06 14:19:48 systemd[1]: Job dev-disk-by\x2duuid-17c356e1\x2d1ec9\x2d47d1\x2d8e98\x2d45182b7a9454.device/start failed with result 'timeout'.
May 06 14:19:48 polkitd[4760]: Unregistered Authentication Agent for unix-process:7789:57998 (system bus name :1.34, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, locale en_GB.UTF-8) (disconnected from bus)
May 06 14:19:48 sudo[7787]: pam_unix(sudo:session): session closed for user root
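You should probably check the contents of /etc/fstab and see whether you have 2 (or more) scylla (/var/lib/scylla) entries. If you do, that is likely why the mount fails; there should be only 1 entry.
If /etc/fstab has more than 1 entry, or no scylla entry at all, the scylla service will not start, and that is the error you are seeing in the logs.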
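To check the entry count without opening an editor, something like the following sketch could work. It assumes the usual fstab layout, where the mount point is the second whitespace-separated field; count_fstab_entries is a made-up helper name, not a Scylla tool.

```shell
#!/bin/sh
# count_fstab_entries is a hypothetical helper (not part of Scylla).
# It prints how many non-comment lines in the given fstab file use the
# given mount point (field 2).
count_fstab_entries() {
    fstab_file="$1"
    mount_point="$2"
    awk -v mp="$mount_point" '
        $1 !~ /^#/ && $2 == mp { n++ }
        END { print n + 0 }
    ' "$fstab_file"
}

# Usage on a node (a healthy setup should print 1):
#   count_fstab_entries /etc/fstab /var/lib/scylla
```

Anything other than 1 here would match the situation described above.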
Here are the steps you can try:
1) List all the disks
$ fdisk -l
2) Re-create the RAID
$ sudo /usr/lib/scylla/scylla_raid_setup --disks /dev/nvme2n1,/dev/nvme3n1,/dev/nvme0n1,/dev/nvme1n1…………………<list all the disks you want to create a RAID volume>……………… --raiddev /dev/md0 --update-fstab --root /var/lib/scylla --volume-role all
(Alternative approach)
udevadm settle
mdadm --create --verbose --force --run /dev/md0 --level=0 -c1024 --raid-devices=<NUMBER OF DISKS> /dev/nvme0n1….<SPECIFY THE DISKS SPACE DELIMITED>
udevadm settle
3) Format the RAID0 device with XFS
$ mkfs.xfs /dev/md0 -f -K
4) Remove the old entries from fstab
$ vi /etc/fstab ## delete every /var/lib/scylla line
5) Add the new line to fstab
$ echo "`blkid /dev/md0 | awk '{print $2}'` /var/lib/scylla xfs noatime 0 0" >> /etc/fstab
6) Reload the systemd daemon
$ systemctl daemon-reload
7) Mount the filesystem
$ systemctl start var-lib-scylla.mount
8) Re-create the directories
$ mkdir -p "/var/lib/scylla/data"
$ mkdir -p "/var/lib/scylla/commitlog"
$ mkdir -p "/var/lib/scylla/hints"
$ mkdir -p "/var/lib/scylla/coredump"
9) Change ownership
$ chown -R scylla:scylla "/var/lib/scylla"
10) Start Scylla
$ systemctl start scylla-server
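The "Timed out waiting for device dev-disk-by-uuid..." lines in the journal suggest that fstab points at a UUID that no longer exists, so after the steps above it may be worth confirming that the UUID in /etc/fstab is the one /dev/md0 actually has. A rough sketch, assuming fstab uses the UUID= form shown in the question; fstab_uuid_for is a hypothetical helper name:

```shell
#!/bin/sh
# fstab_uuid_for is a hypothetical helper (not part of Scylla).
# It prints the UUID, without quotes, that the given fstab file lists for
# the given mount point; one line of output per matching entry.
fstab_uuid_for() {
    fstab_file="$1"
    mount_point="$2"
    awk -v mp="$mount_point" '
        $1 !~ /^#/ && $2 == mp && $1 ~ /^UUID=/ {
            uuid = $1
            sub(/^UUID=/, "", uuid)   # drop the UUID= prefix
            gsub(/"/, "", uuid)       # drop optional double quotes
            print uuid
        }
    ' "$fstab_file"
}

# Usage on a node (run as root so blkid can read the device):
#   expected=$(fstab_uuid_for /etc/fstab /var/lib/scylla)
#   actual=$(blkid -s UUID -o value /dev/md0)
#   [ "$expected" = "$actual" ] && echo "fstab matches /dev/md0" || echo "fstab UUID differs"
```

If the two values differ, or the helper prints more than one line, the fstab entry is probably left over from an earlier RAID setup.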
Let me know if you run into any problems...