CannotStartContainerError: API error (400): OCI runtime create failed: container_linux.go:348

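I am trying to run a script through AWS Batch, following the tutorial here. In particular, the entry point script is the same as in the tutorial: it downloads from an S3 bucket the code to be executed in AWS Batch. However, no matter how I try to execute it on AWS, I always get: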

CannotStartContainerError: API error (400): OCI runtime create failed: 
  container_linux.go:348: starting container process caused "exec:
  \"/usr/local/bin/fetch_and_run.sh\": 
  stat /usr/local/bin/fetch_and_run.sh: no such file or directory": unknown

I can start the same container locally.

I launch the job from awscli with the following command:

aws batch submit-job --job-name mss_dev --job-definition mapper \
  --job-queue bio-job-queue \
  --container-overrides '{"environment":
  [{"name": "BATCH_FILE_S3_URL", "value": "s3://test/myjob.sh"},
   {"name": "BATCH_FILE_TYPE", "value": "script"}],
   "command":["/usr/local/bin/fetch_and_run.sh"]}'

My Dockerfile is as follows:

FROM amazonlinux:latest

# General dependencies and user
## aws-cli installed twice (here for root, later for user)
RUN yum -y install which unzip tar wget aws-cli curl sudo
RUN yum -y groupinstall 'Development Tools'
RUN yum -y install gcc git curl make zlib-devel bzip2 bzip2-devel readline-devel sqlite sqlite-devel openssl openssl-devel
RUN yum -y install java-1.8.0-openjdk.x86_64
## User and work directory
RUN groupadd -r user && useradd -mr -g user -d /home/user -s /sbin/nologin -c "Docker image user" user
RUN echo "user ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers
ENV HOME /home/user
## Change user to user
USER user
ENV USER user
RUN sh -c "$(curl -fsSL https://raw.githubusercontent.com/Linuxbrew/install/master/install.sh)" && echo 'export PATH="/home/linuxbrew/.linuxbrew/bin:$PATH"' >>~/.profile
## GNU parallel 10 seconds installation
#WORKDIR $HOME/tools/parallel
#RUN (wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
# RUN brew install gcc
ENV PATH "/home/linuxbrew/.linuxbrew/bin:$PATH"
RUN brew install parallel

# Pyenv
WORKDIR $HOME
RUN git clone https://github.com/yyuu/pyenv.git .pyenv

ENV PYENV_ROOT $HOME/.pyenv
ENV PATH $PYENV_ROOT/shims:$PYENV_ROOT/bin:$PATH

# Python3
RUN pyenv install 3.6.5
RUN pyenv global 3.6.5
RUN pyenv rehash

# Python3 modules
RUN pip install --upgrade pip
RUN pip install --upgrade awscli pandas scipy numpy kneed

# STAR
RUN mkdir -p $HOME/tools/STAR
WORKDIR $HOME/tools/STAR
RUN wget https://github.com/alexdobin/STAR/archive/2.6.1b.tar.gz && tar xvf 2.6.1b.tar.gz

# DropSeq
RUN mkdir -p $HOME/tools/DropSeq
WORKDIR $HOME/tools/DropSeq
RUN wget https://github.com/broadinstitute/Drop-seq/releases/download/v1.13/Drop-seq_tools-1.13.zip && unzip Drop-seq_tools-1.13.zip

# Reference and other files should be downloaded during execution
RUN mkdir -p $HOME/data
RUN mkdir -p $HOME/results
COPY --chown=user:user code /home/user/code

# Copy main files and set entrypoint
WORKDIR /tmp
ADD fetch_and_run.sh /usr/local/bin/fetch_and_run.sh
USER nobody
ENTRYPOINT ["/usr/local/bin/fetch_and_run.sh"]
# To debug
# ENTRYPOINT ["/bin/bash"]

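The culprit was in the job definition (created from the AWS console; see "Create a job definition" here). For the ECR repository URI, I forgot to use the URI of my updated image (e.g. 012345678901.dkr.ecr.us-east-1.amazonaws.com/awsbatch/fetch_and_run) and was using the default amazonlinux image instead. That stock image does not contain /usr/local/bin/fetch_and_run.sh, hence the "no such file or directory" error.

The main hint was that I was able to run it locally.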
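
To catch this kind of mismatch without clicking through the console, one option is to ask AWS Batch which image the job definition actually points at. The sketch below is only an illustration: the helper names `job_definition_image` and `is_ecr_image` are made up for this example, the ECR check is a simple pattern match on the URI shape shown above, and it assumes your image is supposed to live in a private ECR repository (an image hosted elsewhere would be flagged even though it may be fine).

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when the image reference has the shape
# <12-digit-account-id>.dkr.ecr.<region>.amazonaws.com/<repository>[:tag]
is_ecr_image() {
  local re='^[0-9]{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com/.+'
  [[ "$1" =~ $re ]]
}

# Hypothetical helper: prints the container image of the highest ACTIVE
# revision of the given job definition (requires configured AWS credentials).
job_definition_image() {
  aws batch describe-job-definitions \
    --job-definition-name "$1" \
    --status ACTIVE \
    --query 'sort_by(jobDefinitions, &revision)[-1].containerProperties.image' \
    --output text
}

# Example usage (not run automatically):
#   image=$(job_definition_image mapper)
#   is_ecr_image "$image" || echo "Job definition uses '$image', not an ECR image" >&2
```

With the job definition from the question, the lookup would have returned amazonlinux rather than the ECR URI, which explains why the script was present locally but missing on AWS Batch.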