Snakemake: using the first genotype of a VCF file as a wildcard in the output
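In the second rule, I would like to select, from a VCF file containing bob, clara and tim, only the first genotype in the order given by the dictionary (i.e. bob), so that the second rule writes bob.dn.vcf as its output. Is this possible in Snakemake?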
d = {"FAM1": ["bob.bam", "clara.bam", "tim.bam"]}
FAMILIES = list(d)

rule all:
    input:
        expand some outputs

wildcard_constraints:
    family = "|".join(FAMILIES)

rule somerulename:
    input:
        lambda w: d[w.family]
    output:
        vcf="{family}/{family}.vcf"
    shell:
        """
        some python command line which produces a single vcf file with bob, clara and tim
        """

rule somerulename:
    input:
        invcf="{family}/{family}.vcf"
    params:
        ref="someref.fasta"
    output:
        out="{family}/{bob}.dn.vcf"
    shell:
        """
        gatk --java-options "-Xms2G -Xmx2g -XX:ParallelGCThreads=2" SelectVariants -R {params.ref} -V {input.invcf} -O {output.out}
        """
There are at least three options:
- Specify the output explicitly:
    rule somerulename:
        output:
            out="FAM1/bob.dn.vcf"
- Constrain the values the wildcards can take:
    rule somerulename:
        output:
            out="{family}/{bob}.dn.vcf"
        wildcard_constraints:
            family="FAM1",
            bob="bob",
- Control what gets generated by requesting the appropriate inputs in rule all:
    rule all:
        input: "FAM1/bob.dn.vcf", "FAM2/alice.dn.vcf"
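
The third option can also be driven by the dictionary instead of hard-coded paths. The following is a minimal sketch, not part of the original answer: `first_sample` and `dn_targets` are hypothetical helper names, and the code assumes that the sample name is the BAM file name without its extension and that the first list entry is the sample of interest.

```python
from pathlib import Path

# Same mapping as in the question: family -> BAM files, sample of interest first.
d = {"FAM1": ["bob.bam", "clara.bam", "tim.bam"]}


def first_sample(family, mapping=d):
    """Return the sample name of the first BAM listed for a family (hypothetical helper)."""
    # Assumption: the sample name equals the file stem, e.g. "bob.bam" gives "bob".
    return Path(mapping[family][0]).stem


def dn_targets(mapping=d):
    """Build one '<family>/<first sample>.dn.vcf' target path per family."""
    return [f"{family}/{first_sample(family, mapping)}.dn.vcf" for family in mapping]


if __name__ == "__main__":
    print(dn_targets())
```

Since a Snakefile is Python, these helpers could be defined at the top of it and `dn_targets()` used as the input of rule all. Note that the SelectVariants command in the question does not restrict the output to one sample; GATK's `--sample-name` (`-sn`) argument, for example `-sn {wildcards.bob}`, is one way to do that, assuming the sample names in the VCF match the BAM file stems.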