Error in Snakemake 'Unexpected keyword expand in rule definition'
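As the title says, my Snakefile raises a SyntaxError at the expand function in rule all. I know this is usually caused by whitespace/indentation errors, but I have confirmed that there are no tabs in the file. I have deleted every space and searched the file with grep. I would appreciate any suggestions.
Error message: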
SyntaxError in line 14 of /PATH/to/Snakefile:
Unexpected keyword expand in rule definition (Snakefile, line 14)
Code:
from glob import glob
from numpy import unique

reads = glob('{}/*'.format(config['readDir']))
samples = []
for i in reads:
    sampleName = i.replace('{}/'.format(config['readDir']), '')
    sampleName = sampleName.replace('{}'.format(config['readSuffix1']), '')
    sampleName = sampleName.replace('{}'.format(config['readSuffix2']), '')
    samples.append(sampleName)
samples = unique(samples)

rule all:
    expand('fastqc/{sample}_1_fastqc.html', sample=samples),
    expand('gene_count/{sample}.count', sample=samples)

rule fastqc:
    input:
        r1 = config['readDir'] + '/{sample}' + config['readSuffix1'],
        r2 = config['readDir'] + '/{sample}' + config['readSuffix2']
    output:
        o1 = 'fastqc/{sample}_1_fastqc.html',
        o2 = 'fastqc/{sample}_2_fastqc.html'
    params:
        'fastqc'
    shell:
        'fastqc {input.r1} {input.r2} -o {params}'

rule trim:
    input:
        r1 = config['readDir'] + '/{sample}' + config['readSuffix1'],
        r2 = config['readDir'] + '/{sample}' + config['readSuffix2']
    output:
        'trimmed_reads/{sample}_val_1.fq',
        'trimmed_reads/{sample}_val_2.fq'
    params:
        outDir = 'trimmed_reads',
        suffix = '{sample}',
        minPhred = config['minPhred'],
        minOverlap = config['minOverlap']
    shell:
        'trim_galore --paired --quality {params.minPhred} '
        '--stringency {params.minOverlap} --basename {params.suffix} '
        '--output_dir {params.outDir} {input.r1} {input.r2}'

rule align:
    input:
        r1 = 'trimmed_reads/{sample}_val_1.fq',
        r2 = 'trimmed_reads/{sample}_val_2.fq'
    output:
        sam = temp('aligned_reads/{sample}.sam'),
        bam = 'aligned_reads/{sample}.bam'
    params:
        ref = config['hisatRef']
    threads:
        config['threads']
    log:
        'logs/{sample}_hisat2.log'
    shell:
        'hisat2 --dta -p {threads} -x {params.ref} '
        '-1 {input.r1} -2 {input.r2} -S {output.sam} 2> {log}; '
        'samtools sort -@ {threads} -o {output.bam} {output.sam}; '

rule sort_name:
    input:
        'aligned_reads/{sample}.bam'
    output:
        bam = temp('aligned_reads/{sample}_name_sorted.bam'),
        index = temp('aligned_reads/{sample}_name_sorted.bam.bai')
    threads:
        config['threads']
    shell:
        'samtools sort -n -@ {threads} -o {output.bam} {input}; '

rule count:
    input:
        bam = 'aligned_reads/{sample}.bam'
    output:
        'gene_count/{sample}.count'
    params:
        annotations = config['annotations'],
        minMapq = config['minMapq'],
        stranded = config['stranded']
    shell:
        'htseq-count -s {params.stranded} -a {params.minMapq} '
        '--additional_attr=gene_name --additional_attr=gene_type '
        '{input.bam} {params.annotations} > {output}'
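The error comes from rule all, which contains two expand calls separated by a comma but no input: directive. Snakemake expects a keyword such as input: at that position and finds expand instead, hence the message. Once the calls sit under input:, the comma-separated form works. Alternatively, you can replace the , with + to concatenate the two lists into one, as shown below.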
expand('fastqc/{sample}_1_fastqc.html', sample=samples) + expand('gene_count/{sample}.count', sample=samples)
You can also combine the two into a single expand call, as shown below:
expand(['fastqc/{sample}_1_fastqc.html', 'gene_count/{sample}.count'], sample=samples)
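To see why the two forms above name the same targets, here is a minimal pure-Python sketch. Note that expand_sketch is a hypothetical stand-in written for illustration, not Snakemake's own expand, and the sample names 'A' and 'B' are made up. It only mimics the basic behaviour of filling each pattern with each wildcard value.

```python
from itertools import product


def expand_sketch(patterns, **wildcards):
    """Hypothetical stand-in for Snakemake's expand(): fill every pattern
    with every combination of the supplied wildcard values."""
    if isinstance(patterns, str):
        patterns = [patterns]
    keys = list(wildcards)
    combos = list(product(*(wildcards[k] for k in keys)))
    return [p.format(**dict(zip(keys, combo)))
            for p in patterns
            for combo in combos]


samples = ['A', 'B']  # illustrative sample names, not from the question

# Form 1: two calls whose result lists are concatenated with +
joined = (expand_sketch('fastqc/{sample}_1_fastqc.html', sample=samples)
          + expand_sketch('gene_count/{sample}.count', sample=samples))

# Form 2: one call that receives both patterns as a list
combined = expand_sketch(
    ['fastqc/{sample}_1_fastqc.html', 'gene_count/{sample}.count'],
    sample=samples,
)

print(joined)
print(sorted(joined) == sorted(combined))
```

Either way the rule receives one flat list of file names, so the choice between the two forms is mostly a matter of readability.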
The following code solves the problem by placing the expand calls under an input: directive:
rule all:
    input:
        expand('fastqc/{sample}_1_fastqc.html', sample=samples),
        expand('gene_count/{sample}.count', sample=samples)