Snakemake: how to encode a paired analysis

I want to run GATK recalibration on paired samples (tumor and normal). I need to parse the data with pandas. This is what I wrote:

expand("mapped_reads/merged_samples/{sample[1][tumor]}/{sample[1][tumor]}_{sample[1][normal]}.bam", sample=read_table(config["conditions"], ",").iterrows())

This is the conditions file:

432,433
434,435
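
For reference, here is a minimal sketch of how the same target list could be built with plain pandas outside of expand(). It assumes the conditions file has no header row, so the column names tumor and normal are supplied by hand; the function name paired_targets and the in-memory stand-in for the file are hypothetical.

```python
import io

import pandas as pd

# Stand-in for the conditions file shown above (assumed to have no header row).
CONDITIONS = "432,433\n434,435\n"


def paired_targets(handle):
    """Return one merged-BAM path per tumor/normal pair (hypothetical helper)."""
    # Column names are passed explicitly because the file has no header line;
    # dtype=str keeps the sample IDs as text instead of integers.
    pairs = pd.read_csv(handle, header=None, names=["tumor", "normal"], dtype=str)
    return [
        f"mapped_reads/merged_samples/{tumor}/{tumor}_{normal}.bam"
        for tumor, normal in zip(pairs["tumor"], pairs["normal"])
    ]


targets = paired_targets(io.StringIO(CONDITIONS))
print(targets)
```

In a Snakefile the handle would be the path in config["conditions"] rather than an in-memory string.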

I wrote this rule:

rule gatk_RealignerTargetCreator:
    input:
          "mapped_reads/merged_samples/{tumor}.sorted.dup.reca.bam",
          "mapped_reads/merged_samples/{normal}.sorted.dup.reca.bam",

    output:
        "mapped_reads/merged_samples/{tumor}/{tumor}_{normal}.realign.intervals"
    params:
        genome=config['reference']['genome_fasta'],
        mills= config['mills'],
        ph1_indels= config['know_phy'],
    log:
        "mapped_reads/merged_samples/logs/{tumor}_{normal}.realign_info.log"
    threads: 8
    shell:
        "gatk -T RealignerTargetCreator -R {params.genome} "
        "-nt {threads} "
        "-I {input[0]} -I {input[1]} -known {params.ph1_indels} "
        "-o {output} >& {log}"

I get this error:

InputFunctionException in line 17 of /home/maurizio/Desktop/TEST_exome/rules/samfiles.rules:
KeyError: '432/432_433'
Wildcards:
sample=432/432_433

This is samfiles.rules:

rule samtools_merge_bam:
    """
    Merge bam files for multiple units into one for the given sample.
    If the sample has only one unit, files will be copied.
    """
    input:
        lambda wildcards: expand("mapped_reads/bam/{unit}_sorted.bam",unit=config["samples"][wildcards.sample])
    output:
        "mapped_reads/merged_samples/{sample}.bam"
    benchmark:
        "benchmarks/samtools/merge/{sample}.txt"
    run:
        if len(input) > 1:
            shell("/illumina/software/PROG2/samtools-1.3.1/samtools merge {output} {input}")
        else:
            shell("cp {input} {output} && touch -h {output}")

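I can only guess, since you don't show all the relevant rules, but I would say the error occurs because the rule samtools_merge_bam also matches your later BAM files, where you have the pattern {tumor}/{tumor}_{normal}... Its output pattern mapped_reads/merged_samples/{sample}.bam fits such a path with sample=432/432_433, and the input function then looks that value up in config["samples"], which raises the KeyError.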
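
To make the ambiguity concrete, here is a rough sketch in plain Python. It assumes that an unconstrained wildcard behaves like the regular expression .+, which may also match slashes. The dictionary samples is a hypothetical stand-in for config["samples"], since the real unit names are not shown.

```python
import re

# Assumption: an unconstrained {sample} wildcard behaves like the regex ".+",
# which is also allowed to match "/" characters.
merge_output = re.compile(r"mapped_reads/merged_samples/(?P<sample>.+)\.bam$")

# One of the paired targets produced by the expand() call in the question.
requested = "mapped_reads/merged_samples/432/432_433.bam"
sample = merge_output.match(requested).group("sample")
print(sample)  # 432/432_433

# Hypothetical stand-in for config["samples"]; the unit names are made up.
samples = {"432": ["432_unit"], "433": ["433_unit"]}

lookup_error = None
try:
    units = samples[sample]
except KeyError as err:
    # Same failure as in the reported InputFunctionException.
    lookup_error = repr(err)
print(lookup_error)  # KeyError('432/432_433')
```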

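As a solution, you have to resolve this ambiguity (see the Snakemake tutorial). For example, you can constrain the wildcard of samtools_merge_bam so that it cannot contain any slashes.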

wildcard_constraints:
    sample="[^/]+"

You can put the constraint either globally or inside the samtools_merge_bam rule.
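
The effect of the constraint can be checked with the same kind of sketch. This is only an approximation of the matching in plain Python, not Snakemake's actual implementation: the wildcard is written as a named group that uses the constraint [^/]+ as its pattern.

```python
import re

# Regex equivalent of "mapped_reads/merged_samples/{sample}.bam"
# with the constraint sample="[^/]+" applied.
constrained = re.compile(r"mapped_reads/merged_samples/(?P<sample>[^/]+)\.bam$")

# The paired target contains a slash after the directory, so it no longer matches.
paired = constrained.match("mapped_reads/merged_samples/432/432_433.bam")
print(paired)  # None

# A plain per-sample BAM file is still matched by the merge rule.
single = constrained.match("mapped_reads/merged_samples/432.bam")
print(single.group("sample"))  # 432
```

With the constraint in place, samtools_merge_bam is no longer a candidate for the paired path, so another rule has to produce that file.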