How to use singularity and conda wrappers in Snakemake

TLDR: I'm getting the following error:

The 'conda' command is not available inside your singularity container image. Snakemake mounts your conda installation into singularity. Sometimes, this can fail because of shell restrictions. It has been tested to work with docker://ubuntu, but it e.g. fails with docker://bash

I've created a Snakemake workflow and, via Snakemake wrappers, converted the shell: commands to rule-based package management.

However, I ran into problems running this on HPC, where one of the HPC support staff strongly recommended against using conda on any HPC system, because:

"if the builder [of wrapper] is not super careful, dynamic libraries present in the conda environment that relies on the host libs (there are always a couple present because builder are most of the time carefree) will break. I think that relying on Singularity for your pipeline would make for a more robust system." - Anon

I did some reading over the weekend and, according to this document, it's possible to combine containers with conda-based package management by defining a global conda docker container plus per-rule yaml files.

Note: In contrast to the example in the link above (Figure 5.4), which uses a predefined yaml and a shell: command, here I've used conda wrappers, which download these yaml files into the Singularity container (if I'm thinking about this correctly), so I thought it should function the same - see the Note: at the end though...
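For reference, the Figure 5.4 pattern from that document looks roughly like this; the rule and the envs/fastqc.yaml path are hypothetical placeholders:

# global container that provides conda
singularity: "docker://continuumio/miniconda3:4.5.11"

rule fastqc_example:
    input:
        "reads/{sample}.fastq.gz"
    output:
        "qc/{sample}_fastqc.html"
    conda:
        "envs/fastqc.yaml"  # predefined per-rule env recipe (hypothetical path)
    shell:
        "fastqc {input} --outdir qc"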

Snakefile

# Directories------------------------------------------------------------------
configfile: "config.yaml"

# Setting the names of all directories
dir_list = ["REF_DIR", "LOG_DIR", "BENCHMARK_DIR", "QC_DIR", "TRIM_DIR", "ALIGN_DIR", "MARKDUP_DIR", "CALLING_DIR", "ANNOT_DIR"]
dir_names = ["refs", "logs", "benchmarks", "qc", "trimming", "alignment", "mark_duplicates", "variant_calling", "annotation"]
dirs_dict = dict(zip(dir_list, dir_names))

import os
import pandas as pd
# getting the samples information (names, path to r1 & r2) from samples.txt
samples_information = pd.read_csv("samples.txt", sep='\t', index_col=False)
# get a list of the sample names
sample_names = list(samples_information['sample'])
sample_locations = list(samples_information['location'])
samples_dict = dict(zip(sample_names, sample_locations))
# get number of samples
len_samples = len(sample_names)


# Singularity with conda wrappers

singularity: "docker://continuumio/miniconda3:4.5.11"

# Rules -----------------------------------------------------------------------

rule all:
    input:
        "resources/vep/plugins",
        "resources/vep/cache"

rule download_vep_plugins:
    output:
        directory("resources/vep/plugins")
    params:
        release=100
    resources:
        mem=1000,
        time=30
    wrapper:
        "0.66.0/bio/vep/plugins"

rule get_vep_cache:
    output:
        directory("resources/vep/cache")
    params:
        species="caenorhabditis_elegans",
        build="WBcel235",
        release="100"
    resources:
        mem=1000,
        time=30
    log:
        "logs/vep/cache.log"
    cache: True  # save space and time with between workflow caching (see docs)
    wrapper:
        "0.66.0/bio/vep/cache"

config.yaml

# Files
REF_GENOME: "c_elegans.PRJNA13758.WS265.genomic.fa"
GENOME_ANNOTATION: "c_elegans.PRJNA13758.WS265.annotations.gff3"

# Tools
QC_TOOL: "fastQC"
TRIM_TOOL: "trimmomatic"
ALIGN_TOOL: "bwa"
MARKDUP_TOOL: "picard"
CALLING_TOOL: "varscan"
ANNOT_TOOL: "vep"

samples.txt

sample  location
MTG324  /home/moldach/wrappers/SUBSET/MTG324_SUBSET

Submission

snakemake --profile slurm --use-singularity --use-conda --jobs 2

Log

Workflow defines that rule get_vep_cache is eligible for caching between workflows (use the --cache argument to enable this).
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
        1   get_vep_cache
        1

[Mon Sep 21 15:35:50 2020]
rule get_vep_cache:
    output: resources/vep/cache
    log: logs/vep/cache.log
    jobid: 0
    resources: mem=1000, time=30

Activating singularity image /home/moldach/wrappers/SUBSET/VEP/.snakemake/singularity/d7617773b315c3abcb29e0484085ed06.simg
Activating conda environment: /home/moldach/wrappers/SUBSET/VEP/.snakemake/conda/774ea575
[Mon Sep 21 15:36:38 2020]
Finished job 0.
1 of 1 steps (100%) done

Note: Leaving --use-conda out of the workflow submission will cause an error for get_vep_cache: /bin/bash: vep_install: command not found

Workflow defines that rule get_vep_cache is eligible for caching between workflows (use the --cache argument to enable this).
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
        1   download_vep_plugins
        1

[Mon Sep 21 15:35:50 2020]
rule download_vep_plugins:
    output: resources/vep/plugins
    jobid: 0
    resources: mem=1000, time=30

Activating singularity image /home/moldach/wrappers/SUBSET/VEP/.snakemake/singularity/d7617773b315c3abcb29e0484085ed06.simg
Activating conda environment: /home/moldach/wrappers/SUBSET/VEP/.snakemake/conda/9f602d9a
[Mon Sep 21 15:35:56 2020]
Finished job 0.
1 of 1 steps (100%) done

The problem occurs when adding a third rule, fastqc:

Updated Snakefile

# Directories------------------------------------------------------------------
configfile: "config.yaml"

# Setting the names of all directories
dir_list = ["REF_DIR", "LOG_DIR", "BENCHMARK_DIR", "QC_DIR", "TRIM_DIR", "ALIGN_DIR", "MARKDUP_DIR", "CALLING_DIR", "ANNOT_DIR"]
dir_names = ["refs", "logs", "benchmarks", "qc", "trimming", "alignment", "mark_duplicates", "variant_calling", "annotation"]
dirs_dict = dict(zip(dir_list, dir_names))

import os
import pandas as pd
# getting the samples information (names, path to r1 & r2) from samples.txt
samples_information = pd.read_csv("samples.txt", sep='\t', index_col=False)
# get a list of the sample names
sample_names = list(samples_information['sample'])
sample_locations = list(samples_information['location'])
samples_dict = dict(zip(sample_names, sample_locations))
# get number of samples
len_samples = len(sample_names)


# Singularity with conda wrappers

singularity: "docker://continuumio/miniconda3:4.5.11"

# Rules -----------------------------------------------------------------------

rule all:
    input:
        "resources/vep/plugins",
        "resources/vep/cache",
        expand('{QC_DIR}/{QC_TOOL}/before_trim/{sample}_{pair}_fastqc.{ext}', QC_DIR=dirs_dict["QC_DIR"], QC_TOOL=config["QC_TOOL"], sample=sample_names, pair=['R1', 'R2'], ext=['html', 'zip'])

rule download_vep_plugins:
    output:
        directory("resources/vep/plugins")
    params:
        release=100
    resources:
        mem=1000,
        time=30
    wrapper:
        "0.66.0/bio/vep/plugins"

rule get_vep_cache:
    output:
        directory("resources/vep/cache")
    params:
        species="caenorhabditis_elegans",
        build="WBcel235",
        release="100"
    resources:
        mem=1000,
        time=30
    log:
        "logs/vep/cache.log"
    cache: True  # save space and time with between workflow caching (see docs)
    wrapper:
        "0.66.0/bio/vep/cache"

def getHome(sample):
  return(list(os.path.join(samples_dict[sample],"{0}_{1}.fastq.gz".format(sample,pair)) for pair in ['R1','R2']))

rule qc_before_trim_r1:
    input:
        r1=lambda wildcards: getHome(wildcards.sample)[0]
    output:
        html=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim","{sample}_R1_fastqc.html"),
        zip=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim","{sample}_R1_fastqc.zip"),
    params:
         dir=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim")
    log:
        os.path.join(dirs_dict["LOG_DIR"],config["QC_TOOL"],"{sample}_R1.log")
    resources:
        mem=1000,
        time=30
    singularity:
        "https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0"
    threads: 1
    message: """--- Quality check of raw data with FastQC before trimming."""
    wrapper:
         "0.66.0/bio/fastqc"

rule qc_before_trim_r2:
    input:
        r1=lambda wildcards: getHome(wildcards.sample)[1]
    output:
        html=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim","{sample}_R2_fastqc.html"),
        zip=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim","{sample}_R2_fastqc.zip"),
    params:
         dir=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim")
    log:
        os.path.join(dirs_dict["LOG_DIR"],config["QC_TOOL"],"{sample}_R2.log")
    resources:
        mem=1000,
        time=30
    singularity:
        "https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0"
    threads: 1
    message: """--- Quality check of raw data with FastQC before trimming."""
    wrapper:
        "0.66.0/bio/fastqc"

The error reported in nohup.out:

Building DAG of jobs...
Pulling singularity image https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0.
CreateCondaEnvironmentException:
The 'conda' command is not available inside your singularity container image. Snakemake mounts your conda installation into singularity. Sometimes, this can fail because of shell restrictions. It has been tested to work with docker://ubuntu, but it e.g. fails with docker://bash 
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/deployment/conda.py", line 247, in create
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/deployment/conda.py", line 381, in __new__
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/deployment/conda.py", line 394, in __init__
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/deployment/conda.py", line 417, in _check

Using shell: instead of wrapper:

I changed the wrappers back to shell commands:
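A sketch of the R2 rule after the swap (only the wrapper: block is replaced by shell:; the exact fastqc invocation here is my approximation):

rule qc_before_trim_r2:
    input:
        r1=lambda wildcards: getHome(wildcards.sample)[1]
    output:
        html=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim","{sample}_R2_fastqc.html"),
        zip=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim","{sample}_R2_fastqc.zip"),
    params:
        dir=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim")
    log:
        os.path.join(dirs_dict["LOG_DIR"],config["QC_TOOL"],"{sample}_R2.log")
    resources:
        mem=1000,
        time=30
    singularity:
        "https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0"
    threads: 1
    message: """--- Quality check of raw data with FastQC before trimming."""
    shell:
        "fastqc {input.r1} --threads {threads} --outdir {params.dir} &> {log}"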

Here is the error I get upon submission:

Workflow defines that rule get_vep_cache is eligible for caching between workflows (use the --cache argument to enable this).
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
        1   qc_before_trim_r2
        1

[Mon Sep 21 16:32:54 2020]
Job 0: --- Quality check of raw data with FastQC before trimming.

Activating singularity image /home/moldach/wrappers/SUBSET/VEP/.snakemake/singularity/6740cb07e67eae01644839c9767bdca5.simg
WARNING: Skipping mount /var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
        LANGUAGE = (unset),
        LC_ALL = (unset),
        LANG = "en_CA.UTF-8"
    are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Skipping '/home/moldach/wrappers/SUBSET/MTG324_SUBSET/MTG324_R2.fastq.gz' which didn't exist, or couldn't be read
Waiting at most 60 seconds for missing files.
MissingOutputException in line 84 of /home/moldach/wrappers/SUBSET/VEP/Snakefile:
Job completed successfully, but some output files are missing. Missing files after 60 seconds:
qc/fastQC/before_trim/MTG324_R2_fastqc.html
qc/fastQC/before_trim/MTG324_R2_fastqc.zip
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 544, in handle_job_success
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 231, in handle_job_success
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

The error Skipping '/home/moldach/wrappers/SUBSET/MTG324_SUBSET/MTG324_R2.fastq.gz' which didn't exist, or couldn't be read is misleading, because the file does in fact exist...
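One way to check whether the file is actually visible inside the container (a diagnostic sketch; the .simg path is copied from the log above):

ls -l /home/moldach/wrappers/SUBSET/MTG324_SUBSET/MTG324_R2.fastq.gz
singularity exec .snakemake/singularity/6740cb07e67eae01644839c9767bdca5.simg \
    ls -l /home/moldach/wrappers/SUBSET/MTG324_SUBSET/MTG324_R2.fastq.gz

If the second command fails while the first succeeds, the path simply isn't bind-mounted into the container.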

Update 2

Following the suggestion from Manavalan Gajapathy, I've eliminated defining singularity at two different levels (global + per-rule).

Now I'm using the singularity container only at the global level, with the wrappers used via --use-conda, which creates the conda environments inside the container:

# Directories------------------------------------------------------------------
configfile: "config.yaml"

# Setting the names of all directories
dir_list = ["REF_DIR", "LOG_DIR", "BENCHMARK_DIR", "QC_DIR", "TRIM_DIR", "ALIGN_DIR", "MARKDUP_DIR", "CALLING_DIR", "ANNOT_DIR"]
dir_names = ["refs", "logs", "benchmarks", "qc", "trimming", "alignment", "mark_duplicates", "variant_calling", "annotation"]
dirs_dict = dict(zip(dir_list, dir_names))

import os
import pandas as pd
# getting the samples information (names, path to r1 & r2) from samples.txt
samples_information = pd.read_csv("samples.txt", sep='\t', index_col=False)
# get a list of the sample names
sample_names = list(samples_information['sample'])
sample_locations = list(samples_information['location'])
samples_dict = dict(zip(sample_names, sample_locations))
# get number of samples
len_samples = len(sample_names)


# Singularity with conda wrappers

singularity: "docker://continuumio/miniconda3:4.5.11"

# Rules -----------------------------------------------------------------------

rule all:
    input:
        "resources/vep/plugins",
        "resources/vep/cache",
        expand('{QC_DIR}/{QC_TOOL}/before_trim/{sample}_{pair}_fastqc.{ext}', QC_DIR=dirs_dict["QC_DIR"], QC_TOOL=config["QC_TOOL"], sample=sample_names, pair=['R1', 'R2'], ext=['html', 'zip'])

rule download_vep_plugins:
    output:
        directory("resources/vep/plugins")
    params:
        release=100
    resources:
        mem=1000,
        time=30
    wrapper:
        "0.66.0/bio/vep/plugins"

rule get_vep_cache:
    output:
        directory("resources/vep/cache")
    params:
        species="caenorhabditis_elegans",
        build="WBcel235",
        release="100"
    resources:
        mem=1000,
        time=30
    log:
        "logs/vep/cache.log"
    cache: True  # save space and time with between workflow caching (see docs)
    wrapper:
        "0.66.0/bio/vep/cache"

def getHome(sample):
  return(list(os.path.join(samples_dict[sample],"{0}_{1}.fastq.gz".format(sample,pair)) for pair in ['R1','R2']))

rule qc_before_trim_r1:
    input:
        r1=lambda wildcards: getHome(wildcards.sample)[0]
    output:
        html=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim","{sample}_R1_fastqc.html"),
        zip=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim","{sample}_R1_fastqc.zip"),
    params:
        dir=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim")
    log:
        os.path.join(dirs_dict["LOG_DIR"],config["QC_TOOL"],"{sample}_R1.log")
    resources:
        mem=1000,
        time=30
    threads: 1
    message: """--- Quality check of raw data with FastQC before trimming."""
    wrapper:
        "0.66.0/bio/fastqc"

rule qc_before_trim_r2:
    input:
        r1=lambda wildcards: getHome(wildcards.sample)[1]
    output:
        html=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim","{sample}_R2_fastqc.html"),
        zip=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim","{sample}_R2_fastqc.zip"),
    params:
        dir=os.path.join(dirs_dict["QC_DIR"],config["QC_TOOL"],"before_trim")
    log:
        os.path.join(dirs_dict["LOG_DIR"],config["QC_TOOL"],"{sample}_R2.log")
    resources:
        mem=1000,
        time=30
    threads: 1
    message: """--- Quality check of raw data with FastQC before trimming."""
    wrapper:
        "0.66.0/bio/fastqc"

and submitted via:

However, I still get an error:

Workflow defines that rule get_vep_cache is eligible for caching between workflows (use the --cache argument to enable this).
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
        1   qc_before_trim_r2
        1

[Tue Sep 22 12:44:03 2020]
Job 0: --- Quality check of raw data with FastQC before trimming.

Activating singularity image /home/moldach/wrappers/SUBSET/OMG/.snakemake/singularity/d7617773b315c3abcb29e0484085ed06.simg
Activating conda environment: /home/moldach/wrappers/SUBSET/OMG/.snakemake/conda/c591f288
Skipping '/work/mtgraovac_lab/MATTS_SCRATCH/rep1_R2.fastq.gz' which didn't exist, or couldn't be read
Skipping ' 2> logs/fastQC/rep1_R2.log' which didn't exist, or couldn't be read
Failed to process qc/fastQC/before_trim
java.io.FileNotFoundException: qc/fastQC/before_trim (Is a directory)
        at java.base/java.io.FileInputStream.open0(Native Method)
        at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
        at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
        at uk.ac.babraham.FastQC.Sequence.FastQFile.<init>(FastQFile.java:73)
        at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:106)
        at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:62)
        at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:159)
        at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:121)
        at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:316)
Traceback (most recent call last):
  File "/home/moldach/wrappers/SUBSET/OMG/.snakemake/scripts/tmpiwwprg5m.wrapper.py", line 35, in <module>
    shell(
  File "/mnt/snakemake/snakemake/shell.py", line 205, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail;  fastqc qc/fastQC/before_trim --quiet -t 1 --outdir /tmp/tmps93snag8 /work/mtgraovac_lab/MATTS_SCRATCH/rep1_R2.fastq.gz ' 2> logs/fastQC/rep1_R$
[Tue Sep 22 12:44:16 2020]
Error in rule qc_before_trim_r2:
    jobid: 0
    output: qc/fastQC/before_trim/rep1_R2_fastqc.html, qc/fastQC/before_trim/rep1_R2_fastqc.zip
    log: logs/fastQC/rep1_R2.log (check log file(s) for error message)
    conda-env: /home/moldach/wrappers/SUBSET/OMG/.snakemake/conda/c591f288

RuleException:
CalledProcessError in line 97 of /home/moldach/wrappers/SUBSET/OMG/Snakefile:
Command ' singularity exec --home /home/moldach/wrappers/SUBSET/OMG  --bind /home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages:/mnt/snakemake /home/moldach/wrappers/SUBSET/OMG/.snakemake$
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2189, in run_wrapper
  File "/home/moldach/wrappers/SUBSET/OMG/Snakefile", line 97, in __rule_qc_before_trim_r2
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 529, in _callback
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/concurrent/futures/thread.py", line 57, in run
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 515, in cached_or_run
  File "/home/moldach/anaconda3/envs/snakemake/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2201, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

Reproducibility

To replicate this, you can download this small dataset:

git clone https://github.com/CRG-CNAG/CalliNGS-NF.git
cp CalliNGS-NF/data/reads/rep1_*.fq.gz .
mv rep1_1.fq.gz rep1_R1.fastq.gz 
mv rep1_2.fq.gz rep1_R2.fastq.gz 

Update 3: Bind mounts

According to the link shared about mounting:

"By default Singularity bind mounts /home/$USER, /tmp, and $PWD into your container at runtime."

Therefore, for simplicity's sake (and also because I was getting errors using --singularity-args), I've moved the required files into /home/$USER and tried running from there.
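For reference, the kind of invocation I was attempting with --singularity-args looked roughly like this (a sketch; --bind makes the data directory visible at the same path inside the container):

snakemake --profile slurm --use-singularity --use-conda --jobs 4 \
    --singularity-args "--bind /work/mtgraovac_lab/MATTS_SCRATCH"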

(snakemake) [~]$ pwd
/home/moldach


(snakemake) [~]$ ll
total 3656
drwx------ 26 moldach moldach    4096 Aug 27 17:36 anaconda3
drwx------  2 moldach moldach    4096 Sep 22 10:11 bin
-rw-------  1 moldach moldach     265 Sep 22 14:29 config.yaml
-rw-------  1 moldach moldach 1817903 Sep 22 14:29 rep1_R1.fastq.gz
-rw-------  1 moldach moldach 1870497 Sep 22 14:29 rep1_R2.fastq.gz
-rw-------  1 moldach moldach      55 Sep 22 14:29 samples.txt
-rw-------  1 moldach moldach    3420 Sep 22 14:29 Snakefile

and ran it with bash -c "nohup snakemake --profile slurm --use-singularity --use-conda --jobs 4 &"

However, I still get this odd error:

Activating conda environment: /home/moldach/.snakemake/conda/fdae4f0d
Skipping ' 2> logs/fastQC/rep1_R2.log' which didn't exist, or couldn't be read
Failed to process qc/fastQC/before_trim
java.io.FileNotFoundException: qc/fastQC/before_trim (Is a directory)
        at java.base/java.io.FileInputStream.open0(Native Method)
        at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
        at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
        at uk.ac.babraham.FastQC.Sequence.FastQFile.<init>(FastQFile.java:73)
        at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:106)
        at uk.ac.babraham.FastQC.Sequence.SequenceFactory.getSequenceFile(SequenceFactory.java:62)
        at uk.ac.babraham.FastQC.Analysis.OfflineRunner.processFile(OfflineRunner.java:159)
        at uk.ac.babraham.FastQC.Analysis.OfflineRunner.<init>(OfflineRunner.java:121)
        at uk.ac.babraham.FastQC.FastQCApplication.main(FastQCApplication.java:316)
Traceback (most recent call last):

Why does it think it is being given a directory?

Note: If you submit only with --use-conda, e.g. bash -c "nohup snakemake --profile slurm --use-conda --jobs 4 &", there is no error from the fastqc rules. However, the --use-conda param alone is not 100% reproducible; case in point: it doesn't work on another HPC I tested it on.

The full log in nohup.out when using --printshellcmds can be found at this gist

TLDR:

The fastqc singularity container used in the qc rules likely doesn't have conda available in it, which doesn't satisfy what snakemake's --use-conda expects.

Explanation:

You are defining the singularity containers at two different levels: 1. the global level, which will be used by all rules unless overridden at the rule level; 2. the per-rule level, which will be used at the rule level.

# global singularity container to use
singularity: "docker://continuumio/miniconda3:4.5.11"

# singularity container defined at rule level
rule qc_before_trim_r1:
    ....
    ....
    singularity:
        "https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0"

When you use both --use-singularity and --use-conda, jobs will run in a conda environment inside the singularity container. So the conda command needs to be available inside the singularity container for this to be possible. While your global-level container obviously satisfies this requirement, I am quite certain (though I haven't tested it) that your fastqc container does not.
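A quick way to verify this assumption (a sketch, run wherever singularity is available; conda is expected to be missing from the fastqc image):

singularity exec docker://continuumio/miniconda3:4.5.11 conda --version
singularity pull fastqc.sif https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0
singularity exec fastqc.sif conda --version   # expected: command not found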

The way snakemake works, if the --use-conda flag is supplied, it will create the conda environment either locally or inside the container, depending on whether the --use-singularity flag is also supplied. Since you are using snakemake-wrappers for the qc rules, and they come with conda env recipes pre-defined, the easiest solution here is to just use the globally defined miniconda container for all rules. That is, the qc rules do not need to use a fastqc-specific container.
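In other words, the qc rules can keep the wrapper and simply drop their rule-level singularity: directive (a sketch, using the same '....' shorthand as above):

rule qc_before_trim_r1:
    ....
    ....
    # no rule-level singularity: here; the global miniconda3 container
    # plus --use-conda supplies fastqc through the wrapper's conda env
    wrapper:
        "0.66.0/bio/fastqc"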

If you really want to use the fastqc container instead, then you should not use the --use-conda flag, but of course this means that all the necessary tools must be available from the containers, whether defined globally or per-rule.
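A sketch of that container-only alternative: keep the rule-level fastqc container, replace the wrapper with a shell: command (the exact invocation is an approximation), and submit without --use-conda:

rule qc_before_trim_r1:
    ....
    ....
    singularity:
        "https://depot.galaxyproject.org/singularity/fastqc:0.11.9--0"
    shell:
        "fastqc {input.r1} --outdir {params.dir} &> {log}"

snakemake --profile slurm --use-singularity --jobs 2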