snakemake: MissingOutputException within docker

I am trying to run a pipeline inside docker using snakemake. I am having a problem using the sortmerna tool to produce the {sample}_merged_sorted_mRNA and {sample}_merged_sorted outputs from the control_merged.fq and treated_merged.fq input files.

Here is my Snakefile:

    SAMPLES = ["control", "treated"]
    for smp in SAMPLES:
        print("Sample " + smp + " will be processed")

    rule final:
        input:
            expand('/output/{sample}_merged.fq', sample=SAMPLES),
            expand('/output/{sample}_merged_sorted', sample=SAMPLES),
            expand('/output/{sample}_merged_sorted_mRNA', sample=SAMPLES),

    rule sortmerna:
        input:
            '/output/{sample}_merged.fq',
        output:
            merged_file='/output/{sample}_merged_sorted_mRNA',
            merged_sorted='/output/{sample}_merged_sorted',
        message: """---SORTING---"""
        shell:
            '''
            sortmerna --ref /usr/share/sortmerna/rRNA_databases/silva-bac-23s-id98.fasta,/usr/share/sortmerna/rRNA_databases/index/silva-bac-23s-id98: --reads {input} --paired_in -a 16 --log --fastx --aligned {output.merged_file} --other {output.merged_sorted} -v
            '''

When I run this I get:

    Waiting at most 5 seconds for missing files.
    MissingOutputException in line 57 of /input/Snakefile:
    Missing files after 5 seconds:
    /output/control_merged_sorted_mRNA
    /output/control_merged_sorted

    This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.

    Shutting down, this might take some time.
    Exiting because a job execution failed. Look above for error message
    Complete log: /input/.snakemake/log/2018-11-05T091643.911334.snakemake.log

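I tried increasing the wait with --latency-wait, but I got the same result. Interestingly, two output files, control_merged_sorted_mRNA.fq and control_merged_sorted.fq, are generated, yet the run still fails and exits. The snakemake version is 5.3.0. Any help?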

snakemake fails because the outputs declared by rule sortmerna are never produced. This is not a latency problem; it is a problem with your output names.

Your rule sortmerna expects these outputs:

    /output/control_merged_sorted_mRNA
    /output/control_merged_sorted

but the program you are using (I know nothing about sortmerna) is apparently producing:

    /output/control_merged_sorted_mRNA.fq
    /output/control_merged_sorted.fq
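
A quick way to confirm this kind of mismatch is to list what actually sits next to each expected path. The following is only a minimal sketch: the helper name find_near_matches is made up for illustration, and the demo uses a temporary directory as a stand-in for /output.

```python
import glob
import os
import tempfile


def find_near_matches(expected_paths):
    """Map each expected path to the files on disk whose names start with it."""
    report = {}
    for path in expected_paths:
        if os.path.exists(path):
            # The file exists under exactly the declared name.
            report[path] = [path]
        else:
            # glob.escape protects any special characters in the path itself.
            report[path] = sorted(glob.glob(glob.escape(path) + "*"))
    return report


if __name__ == "__main__":
    # Throwaway directory standing in for /output (an assumption for the demo).
    with tempfile.TemporaryDirectory() as outdir:
        produced = os.path.join(outdir, "control_merged_sorted_mRNA.fq")
        open(produced, "w").close()
        expected = os.path.join(outdir, "control_merged_sorted_mRNA")
        for want, found in find_near_matches([expected]).items():
            print(want, "->", found)
```

Because the match is a plain prefix match, an expected name such as control_merged_sorted will also list control_merged_sorted_mRNA.fq. That is fine for spotting an added suffix, but the result still needs to be read by eye.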
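
Check how the --aligned and --other options behave when you pass them on the command line: either the value is the real name of the generated file, or it is only a basename to which the program appends the .fq suffix. If it is the latter, I suggest you use: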

    rule final:
        input:
            expand('/output/{sample}_merged.fq', sample=SAMPLES),
            # request the names sortmerna actually writes, including the .fq suffix
            expand('/output/{sample}_merged_sorted.fq', sample=SAMPLES),
            expand('/output/{sample}_merged_sorted_mRNA.fq', sample=SAMPLES),

    rule sortmerna:
        input:
            '/output/{sample}_merged.fq',
        output:
            merged_file='/output/{sample}_merged_sorted_mRNA.fq',
            merged_sorted='/output/{sample}_merged_sorted.fq'
        params:
            # basenames passed to sortmerna, which appends .fq itself
            merged_file_basename='/output/{sample}_merged_sorted_mRNA',
            merged_sorted_basename='/output/{sample}_merged_sorted'
        message: """---SORTING---"""
        shell:
            """
            sortmerna --ref /usr/share/sortmerna/rRNA_databases/silva-bac-23s-id98.fasta,/usr/share/sortmerna/rRNA_databases/index/silva-bac-23s-id98: --reads {input} --paired_in -a 16 --log --fastx --aligned {params.merged_file_basename} --other {params.merged_sorted_basename} -v
            """