Merging two rules into one rule with wildcards/dictionary
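I am mapping and counting mixed sequences from two species.
My pipeline has two rules, because each species has its own GTF file.
I would like to merge them into a single rule and let the specie wildcard select which GTF file to use (for example, via a dictionary).
These are the rules: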
rule count_reads_human:
    input:
        ibam = "sorted_reads/human/{spc}_sorted.bam",
        gtf = "/nadata/users/username/annotations/hg38/gencode.v38.annotation.gtf"
    output:
        bam = "counted_reads/human/{spc}_countBAM.bam",
        htout = "counted_reads/human/{spc}_htseq_count.out"
    log:
        "logs/count_reads/human/{spc}.log"
    shell:
        "htseq-count -o {output.bam} -p bam {input.ibam} {input.gtf} > {output.htout} 2> {log}"

rule count_reads_mouse:
    input:
        ibam = "sorted_reads/mouse/{spc}_sorted.bam",
        gtf = "/nadata/users/username/annotations/mm10/gencode.vM20.annotation.gtf"
    output:
        bam = "counted_reads/mouse/{spc}_countBAM.bam",
        htout = "counted_reads/mouse/{spc}_htseq_count.out"
    log:
        "logs/count_reads/mouse/{spc}.log"
    shell:
        "htseq-count -o {output.bam} -p bam {input.ibam} {input.gtf} > {output.htout} 2> {log}"
What I would like to do is something like this:
gtfDict = {"human": "path/to/human/gtf/file.gtf", "mouse": "/path/to/mouse/gtf/file.gtf"}

rule count_reads:
    input:
        ibam = "sorted_reads/{specie}/{spc}_sorted.bam",
        gtf = gtfDict[{specie}]
    output:
        bam = "counted_reads/{specie}/{spc}_countBAM.bam",
        htout = "counted_reads/{specie}/{spc}_htseq_count.out"
    log:
        "logs/count_reads/{specie}/{spc}.log"
    shell:
        "htseq-count -o {output.bam} -p bam {input.ibam} {input.gtf} > {output.htout} 2> {log}"
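But my merged rule does not work.
Figured it out. The gtf input needs a lambda, so that the dictionary lookup is deferred until the wildcards are known: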
gtfDict = {"human": "/path/to/human.gtf",
           "mouse": "/path/to/mouse.gtf"}

rule count_reads:
    input:
        ibam = "sorted_reads/{specie}/{spc}_sorted.bam",
        gtf = lambda wildcards: gtfDict[wildcards.specie]
    output:
        bam = "counted_reads/{specie}/{spc}_countBAM.bam",
        htout = "counted_reads/{specie}/{spc}_htseq_count.out"
    log:
        "logs/count_reads/{specie}/{spc}.log"
    shell:
        "htseq-count -o {output.bam} -p bam {input.ibam} {input.gtf} > {output.htout} 2> {log}"
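For anyone wondering why the lambda makes the difference, here is a rough plain-Python sketch of what happens, outside Snakemake. It assumes the behaviour I understand Snakemake to have: an input function is called with a single wildcards object whose attributes are the matched wildcard values. The SimpleNamespace below is only a hypothetical stand-in for that object, and "sampleA" is a made-up sample name.

```python
from types import SimpleNamespace

gtfDict = {"human": "/path/to/human.gtf",
           "mouse": "/path/to/mouse.gtf"}

# Deferred lookup, same expression as in the rule: nothing is looked up
# until the function is called with a wildcards object.
get_gtf = lambda wildcards: gtfDict[wildcards.specie]

# Hypothetical stand-in for the wildcards object passed to input functions.
wildcards = SimpleNamespace(specie="mouse", spc="sampleA")
gtf_path = get_gtf(wildcards)
print(gtf_path)

# The original attempt, gtfDict[{specie}], is ordinary Python that runs while
# the Snakefile is parsed. There, {specie} is a set literal, not a wildcard:
# `specie` is not a defined name at that point, and even if it were, a set
# is unhashable and cannot be used as a dictionary key.
try:
    gtfDict[{"human"}]
    direct_lookup_error = None
except TypeError as err:
    direct_lookup_error = err
print(type(direct_lookup_error).__name__)
```

In other words, "{specie}" is only treated as a wildcard inside the quoted strings of a rule; anything outside quotes is evaluated as Python straight away, before any wildcard has a value.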