Snakemake: CalledProcessError when running BWA on multiple files

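I have a folder containing several subfolders, each of which holds .fastq files that I want to align to a genome. I am trying to build a Snakemake workflow for this. First, I use wildcards to reach each subdirectory and the files in it. Then I use the expand function to collect all the file paths, and I write a rule that maps the files to the genome. The code is as follows: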

    from snakemake.io import glob_wildcards, expand
    import sys
    import os

    directories, files = glob_wildcards("data/samples/{dir}/{file}.fastq")
    print(directories, files)

    rule all:
        input:
            expand("data/samples/{dir}/{file}.fastq", zip, dir=directories, file=files)

    rule bwa_map:
        input:
            G = "data/genome.fa",
            r1 = expand("data/samples/{dir}/{file}.fastq", zip, dir=directories, file=files)
        output:
            r2 = expand("data/results/{dir}/{file}.bam", zip, dir=directories, file=files)
        shell:
            "./bwa mem {input.G} {input.r1} | ./samtools sort -o - > {output.r2}"

However, when I run this code with "snakemake bwa_map", I get the following error:

Error in job bwa_map while creating output files data/results/SRR5923/A.bam, data/results/SRR5924/B.bam, data/results/SRR5925/C.bam.
RuleException:
CalledProcessError in line 19 of /Users/rewatitappu/PycharmProjects/RNA-seq_Snakemake/Snakefile:
Command './bwa mem data/genome.fa data/samples/SRR5923/A.fastq data/samples/SRR5924/B.fastq data/samples/SRR5925/C.fastq | ./samtools sort -o - > data/results/SRR5923/A.bam data/results/SRR5924/B.bam data/results/SRR5925/C.bam' returned non-zero exit status 1.
  File "/Users/rewatitappu/PycharmProjects/RNA-seq_Snakemake/Snakefile", line 19, in __rule_bwa_map
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/thread.py", line 55, in run
Removing output files of failed job bwa_map since they might be corrupted:
data/results/SRR5923/A.bam
Will exit after finishing currently running jobs.

Am I running the snakemake command incorrectly, or is there a problem with the code?

The error message shows that the failure occurred while executing the following shell command:

./bwa mem data/genome.fa data/samples/SRR5923/A.fastq data/samples/SRR5924/B.fastq data/samples/SRR5925/C.fastq | ./samtools sort -o - > data/results/SRR5923/A.bam data/results/SRR5924/B.bam data/results/SRR5925/C.bam

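The problem is probably that you have several BAM files as output. The single command receives all three FASTQ files and is redirected to three BAM files at once. bwa mem accepts at most two read files (a pair of mates), and the shell redirect > writes only to the first path, so the remaining paths are passed to samtools sort as extra arguments.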
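To make the cause concrete, here is a small plain-Python illustration of how a list-valued placeholder ends up on the command line. It does not use Snakemake at all: render is a hypothetical helper that only mimics the space-joined rendering of {input} and {output}, and the sample names are taken from the error message above.

```python
# Illustration only: plain Python standing in for Snakemake's placeholder
# formatting. `render` is a hypothetical helper, not a Snakemake function.

# (directory, file) pairs as they appear in the error message.
samples = [("SRR5923", "A"), ("SRR5924", "B"), ("SRR5925", "C")]

TEMPLATE = "./bwa mem {G} {r1} | ./samtools sort -o - > {r2}"


def render(genome, fastqs, bams):
    """Join each list with spaces, as a list-valued placeholder is rendered."""
    return TEMPLATE.format(G=genome, r1=" ".join(fastqs), r2=" ".join(bams))


fastqs = [f"data/samples/{d}/{f}.fastq" for d, f in samples]
bams = [f"data/results/{d}/{f}.bam" for d, f in samples]

# expand() inside the rule: one job, every path on a single command line.
single_job = render("data/genome.fa", fastqs, bams)

# Wildcards in the rule: one job per sample, one FASTQ in and one BAM out.
per_sample = [render("data/genome.fa", [fq], [bam]) for fq, bam in zip(fastqs, bams)]

print(single_job)
for cmd in per_sample:
    print(cmd)
```

The first printed line reproduces the failing command from the error message; the following three lines are the per-sample commands that were presumably intended.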

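You probably should not use expand in the bwa_map rule, because there it turns the rule into a single job that covers every sample. The expansion belongs in the all rule, which should list the final files you want (here, the BAM files), while bwa_map keeps the {dir} and {file} wildcards so that each job handles one sample.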
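A minimal sketch of that suggestion, written as a Snakefile (Snakemake's Python-based syntax, so it is meant to be run with snakemake, not with the Python interpreter). It assumes the same directory layout as in the question and keeps the original shell command unchanged apart from now receiving one file per job; whether that samtools sort invocation is right depends on the samtools version in use.

```python
# glob_wildcards and expand are available in a Snakefile without importing them.
directories, files = glob_wildcards("data/samples/{dir}/{file}.fastq")

rule all:
    input:
        # Request the final BAM files; Snakemake works backwards from these
        # targets to decide which bwa_map jobs to run.
        expand("data/results/{dir}/{file}.bam", zip, dir=directories, file=files)

rule bwa_map:
    input:
        G="data/genome.fa",
        # No expand here: the wildcards are filled in separately for each job.
        r1="data/samples/{dir}/{file}.fastq"
    output:
        r2="data/results/{dir}/{file}.bam"
    shell:
        "./bwa mem {input.G} {input.r1} | ./samtools sort -o - > {output.r2}"
```

With this layout, run snakemake (or snakemake all) rather than snakemake bwa_map: a rule whose output contains wildcards cannot be requested by name as a target, because Snakemake has no way to tell which sample is meant.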