在 snakemake 中的多个输入之前添加标志 -I
Add flag -I before multiple input in snakemake
我正在使用 snakemake 来描述 GATK 管道。我需要 运行 以下命令:
./../gatk-4.1.3.0/gatk --java-options ""-Xmx7680m"" GatherBQSRReports -I file1 -I file2 -I file3 -O output
相反,我得到了以下结果:
./../gatk-4.1.3.0/gatk --java-options ""-Xmx7680m"" GatherBQSRReports -I file1 file2 file3 -O output
现在我的代码如下所示:
import glob
SAMPLES = glob_wildcards("input_bam/{sample}.bam")
INTERVALS = glob_wildcards("../intervals/{interval}.bed")
rule all:
input: expand("out/{sample}.AddOrReplaceReadGroups.bam", sample = SAMPLES.sample), expand("out/{sample}.MarkDuplicates.bam", sample=SAMPLES.sample),,expand("out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt", interval=INTERVALS.interval, sample=SAMPLES.sample), expand("out_BaseRecalibrator/{sample}.GatherBQSRReports.txt", sample=SAMPLES.sample)
rule gatk_AddOrReplaceReadGroups:
input: "input_bam/{sample}.bam"
output: "out/{sample}.AddOrReplaceReadGroups.bam"
shell: "./../gatk-4.1.3.0/gatk --java-options ""-Xmx30g"" AddOrReplaceReadGroups -I {input} -O {output} -ID group1 -SM TUMOR -PL illumina -LB lib1 -PU unit1"
rule gatk_MarkDuplicates:
input: rules.gatk_AddOrReplaceReadGroups.output
output: output1="out/{sample}.MarkDuplicates.bam", output2="out/{sample}.MarkDuplicates.txt"
shell: "./../gatk-4.1.3.0/gatk --java-options ""-Xmx4g"" MarkDuplicates -I {input} -O {output.output1} -M {output.output2} --CREATE_INDEX true"
rule gatk_BaseRecalibrator:
input: input1="../intervals/{interval}.bed", input2="out/{sample}.MarkDuplicates.bam", input3='../ref/ref.fa', input4="../dbsnp/dbsnp_150.hg38.vcf.gz"
output: "out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt"
shell: "./../gatk-4.1.3.0/gatk --java-options ""-Xmx7680m"" BaseRecalibrator -L {input.input1} -I {input.input2} -O {output} -R {input.input3} --known-sites {input.input4}"
rule gatk_GatherBQSRReports:
input: expand("out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt", interval=INTERVALS.interval, sample=SAMPLES.sample)
output: "out_BaseRecalibrator/{sample}.GatherBQSRReports.txt"
shell: "./../gatk-4.1.3.0/gatk --java-options ""-Xmx7680m"" GatherBQSRReports -I {input} -O {output}"
我是这个程序的新手,不会编写所需的参数。
我很乐意提供任何帮助
你可以这样做:
reports = expand("out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt", interval=INTERVALS.interval, sample=SAMPLES.sample)
rule gatk_GatherBQSRReports:
input:
reports,
params:
reports= '-I ' + ' -I '.join(reports),
output:
...
shell:
r"""
./../gatk-4.1.3.0/gatk --java-options -Xmx7680m GatherBQSRReports {params.reports} -O {output}
"""
我正在使用 snakemake 来描述 GATK 管道。我需要 运行 以下命令:
./../gatk-4.1.3.0/gatk --java-options ""-Xmx7680m"" GatherBQSRReports -I file1 -I file2 -I file3 -O output
相反,我得到了以下结果:
./../gatk-4.1.3.0/gatk --java-options ""-Xmx7680m"" GatherBQSRReports -I file1 file2 file3 -O output
现在我的代码如下所示:
import glob
SAMPLES = glob_wildcards("input_bam/{sample}.bam")
INTERVALS = glob_wildcards("../intervals/{interval}.bed")
rule all:
input: expand("out/{sample}.AddOrReplaceReadGroups.bam", sample = SAMPLES.sample), expand("out/{sample}.MarkDuplicates.bam", sample=SAMPLES.sample),,expand("out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt", interval=INTERVALS.interval, sample=SAMPLES.sample), expand("out_BaseRecalibrator/{sample}.GatherBQSRReports.txt", sample=SAMPLES.sample)
rule gatk_AddOrReplaceReadGroups:
input: "input_bam/{sample}.bam"
output: "out/{sample}.AddOrReplaceReadGroups.bam"
shell: "./../gatk-4.1.3.0/gatk --java-options ""-Xmx30g"" AddOrReplaceReadGroups -I {input} -O {output} -ID group1 -SM TUMOR -PL illumina -LB lib1 -PU unit1"
rule gatk_MarkDuplicates:
input: rules.gatk_AddOrReplaceReadGroups.output
output: output1="out/{sample}.MarkDuplicates.bam", output2="out/{sample}.MarkDuplicates.txt"
shell: "./../gatk-4.1.3.0/gatk --java-options ""-Xmx4g"" MarkDuplicates -I {input} -O {output.output1} -M {output.output2} --CREATE_INDEX true"
rule gatk_BaseRecalibrator:
input: input1="../intervals/{interval}.bed", input2="out/{sample}.MarkDuplicates.bam", input3='../ref/ref.fa', input4="../dbsnp/dbsnp_150.hg38.vcf.gz"
output: "out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt"
shell: "./../gatk-4.1.3.0/gatk --java-options ""-Xmx7680m"" BaseRecalibrator -L {input.input1} -I {input.input2} -O {output} -R {input.input3} --known-sites {input.input4}"
rule gatk_GatherBQSRReports:
input: expand("out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt", interval=INTERVALS.interval, sample=SAMPLES.sample)
output: "out_BaseRecalibrator/{sample}.GatherBQSRReports.txt"
shell: "./../gatk-4.1.3.0/gatk --java-options ""-Xmx7680m"" GatherBQSRReports -I {input} -O {output}"
我是这个程序的新手,不会编写所需的参数。 我很乐意提供任何帮助
你可以这样做:
reports = expand("out_BaseRecalibrator/{sample}.BaseRecalibrator.{interval}.txt", interval=INTERVALS.interval, sample=SAMPLES.sample)
rule gatk_GatherBQSRReports:
input:
reports,
params:
reports= '-I ' + ' -I '.join(reports),
output:
...
shell:
r"""
./../gatk-4.1.3.0/gatk --java-options -Xmx7680m GatherBQSRReports {params.reports} -O {output}
"""