Error "no free space in /var/cache/apt/archives" in Singularity container, but disk not full
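I am trying to reproduce the results of an older research paper and need to run a Singularity container with NVIDIA CUDA 9.0 and torch 1.2.0.
Locally, I have Ubuntu 20.04 as a VM, in which I run singularity build. I followed the guide for installing older CUDA versions.
This is the recipe file: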
#header
Bootstrap: docker
From: nvidia/cuda:9.0-runtime-ubuntu16.04
#Sections
%files
/home/timaie/rkn_tcml/cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
/home/timaie/rkn_tcml/RKN/*
%post
# necessary dependencies
pip install numpy scipy scikit-learn biopython pandas
dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
apt-get autoclean
apt-get autoremove
apt-get update
export CUDA_HOME="/usr/local/cuda-9.0"
export TORCH_EXTENSIONS_DIR="$PWD/tmp"
export PYTHONPATH=$PWD:$PYTHONPATH
%runscript
cd experiments
python train_scop.py --pooling max --embedding blosum62 --kmer-size 14 --alternating --sigma 0.4 --tfid 0
This builds fine and gives me an image.simg file. I then tried to install CUDA via sudo singularity exec image.simg apt-get install cuda, which yields the following error:
0 upgraded, 823 newly installed, 0 to remove and 1 not upgraded.
Need to get 2661 MB of archives.
After this operation, 6822 MB of additional disk space will be used.
W: Not using locking for read only lock file /var/lib/dpkg/lock-frontend
W: Not using locking for read only lock file /var/lib/dpkg/lock
W: chown to _apt:root of directory /var/cache/apt/archives/partial failed - SetupAPTPartialDirectory (30: Read-only file system)
W: chmod 0700 of directory /var/cache/apt/archives/partial failed - SetupAPTPartialDirectory (30: Read-only file system)
W: Not using locking for read only lock file /var/cache/apt/archives/lock
E: You don't have enough free space in /var/cache/apt/archives/.
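I have read about a similar issue in Docker, but I do not know of a Singularity equivalent of docker system prune.
I also tried to free up space via apt autoremove and apt autoclean, without success.
There should be enough space left on the disk, since running df -H gives: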
Filesystem Size Used Avail Use% Mounted on
udev 2,1G 0 2,1G 0% /dev
tmpfs 412M 1,4M 411M 1% /run
/dev/sda5 54G 19G 33G 36% /
tmpfs 2,1G 0 2,1G 0% /dev/shm
tmpfs 5,3M 4,1k 5,3M 1% /run/lock
tmpfs 2,1G 0 2,1G 0% /sys/fs/cgroup
/dev/loop0 132k 132k 0 100% /snap/bare/5
/dev/loop1 66M 66M 0 100% /snap/core20/1328
/dev/loop2 261M 261M 0 100% /snap/gnome-3-38-2004/99
/dev/loop3 66M 66M 0 100% /snap/core20/1405
/dev/loop4 69M 69M 0 100% /snap/gtk-common-themes/1519
/dev/loop5 46M 46M 0 100% /snap/snapd/15177
/dev/loop6 57M 57M 0 100% /snap/snap-store/558
/dev/loop7 46M 46M 0 100% /snap/snapd/14978
/dev/sda1 536M 4,1k 536M 1% /boot/efi
tmpfs 412M 25k 412M 1% /run/user/1000
Does anyone know whether the problem lies with my local Ubuntu or with the NVIDIA Docker image?
Thanks for any clarification.
As stated in the overview section of the singularity build documentation:
build can produce containers in two different formats that can be specified as follows.
- compressed read-only Singularity Image File (SIF) format suitable for production (default)
- writable (ch)root directory called a sandbox for interactive development (--sandbox option)
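The warnings in your output (30: Read-only file system) suggest that apt cannot write to /var/cache/apt/archives because the default image format is read-only, not because the host disk is full. Adding --sandbox to the build should make the system files writable, which should solve your problem.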
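As a rough sketch of that workflow, the three steps could be wrapped in a small shell function. The function name build_cuda_image and the default names recipe.def, image_sandbox and image.sif are placeholders I made up, not taken from the question. Note that, as far as I know, a sandbox is still run read-only unless --writable is passed to exec.

```shell
# Sketch only: build a writable sandbox, install CUDA inside it, then repack it.
# build_cuda_image and all default file names below are hypothetical.
build_cuda_image() {
    def_file="${1:-recipe.def}"        # definition (recipe) file
    sandbox_dir="${2:-image_sandbox}"  # writable directory tree
    sif_file="${3:-image.sif}"         # final read-only image

    # 1. Build a writable directory instead of a compressed read-only image.
    sudo singularity build --sandbox "$sandbox_dir" "$def_file" || return 1

    # 2. --writable lets apt-get modify the sandbox contents.
    sudo singularity exec --writable "$sandbox_dir" apt-get install -y cuda || return 1

    # 3. Optionally convert the finished sandbox into a read-only image.
    sudo singularity build "$sif_file" "$sandbox_dir"
}
```

Calling build_cuda_image with no arguments uses the placeholder names; pass your own definition file, sandbox directory and output image as the three arguments.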
Ideally, though, I would recommend adding any apt-get install commands to the %post section of the recipe file.
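A minimal sketch of what that %post section might look like, reusing the repository setup lines from the recipe in the question. The added apt-get install -y cuda and apt-get clean lines are my assumption of what is needed; the -y flag is there because %post runs non-interactively. The build host still needs enough free space for the roughly 6.8 GB that apt reports.

```
%post
    # Register the local CUDA repository copied in via %files (lines from the question).
    dpkg -i cuda-repo-ubuntu1604-9-0-local_9.0.176-1_amd64.deb
    apt-key add /var/cuda-repo-9-0-local/7fa2af80.pub
    apt-get update

    # Assumption: install CUDA at build time, while the root filesystem is writable.
    apt-get install -y cuda

    # Optional: remove downloaded package archives to keep the image smaller.
    apt-get clean
```

This way the resulting image already contains CUDA, and nothing has to be installed into the read-only image afterwards.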