Error in Snakemake 'Unexpected keyword expand in rule definition'

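As the title says, my Snakefile raises a SyntaxError for the expand function in rule all. I know this is usually caused by whitespace/indentation errors, but I have confirmed that there are no tabs in the file. I have also stripped out the whitespace and searched the file with grep. I would appreciate any suggestions.
Error message: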

SyntaxError in line 14 of /PATH/to/Snakefile:
Unexpected keyword expand in rule definition (Snakefile, line 14)

Code:

from glob import glob
from numpy import unique

reads = glob('{}/*'.format(config['readDir']))
samples = []
for i in reads:
  sampleName = i.replace('{}/'.format(config['readDir']), '')
  sampleName = sampleName.replace('{}'.format(config['readSuffix1']), '')
  sampleName = sampleName.replace('{}'.format(config['readSuffix2']), '')
  samples.append(sampleName)
samples = unique(samples)

rule all:
  expand('fastqc/{sample}_1_fastqc.html', sample=samples),
  expand('gene_count/{sample}.count', sample=samples)

rule fastqc:
  input:
    r1 = config['readDir'] + '/{sample}' + config['readSuffix1'],
    r2 = config['readDir'] + '/{sample}' + config['readSuffix2']
  output:
    o1 = 'fastqc/{sample}_1_fastqc.html',
    o2 = 'fastqc/{sample}_2_fastqc.html'
  params:
    'fastqc'
  shell:
    'fastqc {input.r1} {input.r2} -o {params}'

rule trim:
  input:
    r1 = config['readDir'] + '/{sample}' + config['readSuffix1'],
    r2 = config['readDir'] + '/{sample}' + config['readSuffix2']
  output:
    'trimmed_reads/{sample}_val_1.fq',
    'trimmed_reads/{sample}_val_2.fq'
  params:
    outDir = 'trimmed_reads',
    suffix = '{sample}',
    minPhred = config['minPhred'],
    minOverlap = config['minOverlap']
  shell:
    'trim_galore --paired --quality {params.minPhred} '
    '--stringency {params.minOverlap} --basename {params.suffix} '
    '--output_dir {params.outDir} {input.r1} {input.r2}'

rule align:
  input:
    r1 = 'trimmed_reads/{sample}_val_1.fq',
    r2 = 'trimmed_reads/{sample}_val_2.fq'
  output:
    sam = temp('aligned_reads/{sample}.sam'),
    bam = 'aligned_reads/{sample}.bam'
  params:
    ref = config['hisatRef']
  threads:
    config['threads']
  log:
    'logs/{sample}_hisat2.log'
  shell:
    'hisat2 --dta -p {threads} -x {params.ref} '
    '-1 {input.r1} -2 {input.r2} -S {output.sam} 2> {log}; '
    'samtools sort -@ {threads} -o {output.bam} {output.sam}; '

rule sort_name:
  input:
    'aligned_reads/{sample}.bam'
  output:
    bam = temp('aligned_reads/{sample}_name_sorted.bam'),
    index = temp('aligned_reads/{sample}_name_sorted.bam.bai')
  threads:
    config['threads']
  shell:
    'samtools sort -n -@ {threads} -o {output.bam} {input}; '

rule count:
  input:
    bam = 'aligned_reads/{sample}.bam'
  output:
    'gene_count/{sample}.count'
  params:
    annotations = config['annotations'],
    minMapq = config['minMapq'],
    stranded = config['stranded']
  shell:
    'htseq-count -s {params.stranded} -a {params.minMapq} '
    '--additional_attr=gene_name --additional_attr=gene_type '
    '{input.bam} {params.annotations} > {output}'

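The error comes from how rule all is written: it contains two expand calls separated by a comma, placed directly in the rule body rather than under a directive such as input:. If you prefer a single expression, you can replace the , with + to concatenate the two lists, as shown below. Note that the expression still has to sit under input: in rule all.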

expand('fastqc/{sample}_1_fastqc.html', sample=samples) + expand('gene_count/{sample}.count', sample=samples)

You can also combine the two into a single expand call, as shown below:

expand(['fastqc/{sample}_1_fastqc.html', 'gene_count/{sample}.count'], sample=samples)
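
To see why the concatenated form and the single-call form are interchangeable, it can help to model what expand returns. The sketch below is a plain-Python stand-in, not Snakemake's own implementation: the helper name expand_sketch is hypothetical, the sample names are made up, and it assumes expand fills each pattern with every wildcard value and returns a flat list.

```python
from itertools import product


def expand_sketch(patterns, **wildcards):
    """Hypothetical stand-in for Snakemake's expand: fill each pattern with
    every combination of wildcard values and return one flat list."""
    if isinstance(patterns, str):
        patterns = [patterns]
    names = list(wildcards)
    results = []
    for pattern in patterns:
        # Cartesian product over the value lists of all wildcards
        for combo in product(*(wildcards[name] for name in names)):
            results.append(pattern.format(**dict(zip(names, combo))))
    return results


samples = ['A', 'B']  # made-up sample names for illustration

# Two calls joined with +
concatenated = (expand_sketch('fastqc/{sample}_1_fastqc.html', sample=samples)
                + expand_sketch('gene_count/{sample}.count', sample=samples))

# One call with a list of patterns
combined = expand_sketch(['fastqc/{sample}_1_fastqc.html',
                          'gene_count/{sample}.count'], sample=samples)

print(concatenated == combined)
```

Under these assumptions both forms produce the same list of target files, so the choice between them is mostly a matter of readability.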

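Rule all is missing the input: directive, so Snakemake tries to read expand as a rule keyword. The following code will fix the problem: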

rule all:
    input:
        expand('fastqc/{sample}_1_fastqc.html', sample=samples),
        expand('gene_count/{sample}.count', sample=samples)
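
The comma-separated form is fine under input: because, as far as I know, Snakemake flattens lists given there into a single collection of files. A rough plain-Python model of that flattening is sketched below; the helper flatten_inputs is hypothetical and the file names are made up, so treat it as an illustration of the idea rather than Snakemake's actual code.

```python
def flatten_inputs(*items):
    """Hypothetical model of input handling: flatten nested lists of
    file names into one list, preserving order."""
    flat = []
    for item in items:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten_inputs(*item))
        else:
            flat.append(item)
    return flat


# Made-up results of the two expand calls
fastqc_reports = ['fastqc/A_1_fastqc.html', 'fastqc/B_1_fastqc.html']
gene_counts = ['gene_count/A.count', 'gene_count/B.count']

# Two comma-separated lists end up as one flat list of targets
targets = flatten_inputs(fastqc_reports, gene_counts)
print(targets)
```

Seen this way, separating the two expand calls with a comma and joining them with + describe the same set of targets once they are placed under input:.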