Why can't I do split with variable names in a snakemake rule?
See below: despite the .split() call, I cannot understand why my params par1 and par2 end up identical.
Here is a runnable, self-contained example.
You need to run "touch id1-one.input id2-two.input" in the working directory first.
files = ["id1-one", "id2-two"]

rule all:
    input:
        expand("{sample}.output", sample=files)

rule myrule:
    params:
        par1 = "{sample}",
        par2 = "{sample}".split("-")
    input:
        i = "{sample}.input"
    output:
        o = "{sample}.output"
    shell:
        "./myprog -i ${input.i} -o {output.o} par1: {params.par1} par2: {params.par2}"
The output of the run is:
$ snakemake -s small3.smk --cores 10 -n -p
Building DAG of jobs...
Job stats:
job count min threads max threads
------ ------- ------------- -------------
all 1 1 1
myrule 2 1 1
total 3 1 1
[Sat Dec 11 18:59:02 2021]
rule myrule:
input: id2-two.input
output: id2-two.output
jobid: 2
wildcards: sample=id2-two
resources: tmpdir=/var/folders/jb/b9y_67gx3v727w68k7mgrpdm0000gn/T
./myprog -i $id2-two.input -o id2-two.output par1: id2-two par2: id2-two
[Sat Dec 11 18:59:02 2021]
rule myrule:
input: id1-one.input
output: id1-one.output
jobid: 1
wildcards: sample=id1-one
resources: tmpdir=/var/folders/jb/b9y_67gx3v727w68k7mgrpdm0000gn/T
./myprog -i $id1-one.input -o id1-one.output par1: id1-one par2: id1-one
[Sat Dec 11 18:59:02 2021]
localrule all:
input: id1-one.output, id2-two.output
jobid: 0
resources: tmpdir=/var/folders/jb/b9y_67gx3v727w68k7mgrpdm0000gn/T
Job stats:
job count min threads max threads
------ ------- ------------- -------------
all 1 1 1
myrule 2 1 1
total 3 1 1
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.
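The reason par1 and par2 come out identical is that the right-hand side of a params entry is plain Python, evaluated when the Snakefile is parsed. At that point "{sample}" is still the literal placeholder string, which contains no "-", so .split("-") has nothing to split; the wildcard is substituted only afterwards, per job. A minimal Python check illustrates this (no Snakemake needed; the .format() call only approximates Snakemake's wildcard substitution):

literal = "{sample}"
# Evaluated at parse time: nothing to split in the literal placeholder.
print(literal.split("-"))                                          # ['{sample}']
# Wildcard substitution happens later, element by element, so par2
# ends up as ['id1-one'] and prints exactly like par1.
print([p.format(sample="id1-one") for p in literal.split("-")])    # ['id1-one']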
You should use an input function (here, a lambda) in params to get what you want:
rule myrule:
    params:
        par3 = lambda wildcards: wildcards.sample.split("-")
    ...
    shell:
        "par1: {params.par1} par2: {params.par2} par3: {params.par3} par3[0]: {params.par3[0]}"
For sample=id1-one this expands to:
par1: id1-one par2: id1-one par3: id1 one par3[0]: id1
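For completeness, a sketch of the whole corrected Snakefile under the same setup (input files created with touch, ./myprog kept as the question's placeholder program; the stray $ before {input.i} in the original shell command is dropped here):

files = ["id1-one", "id2-two"]

rule all:
    input:
        expand("{sample}.output", sample=files)

rule myrule:
    input:
        i = "{sample}.input"
    output:
        o = "{sample}.output"
    params:
        # A plain string is wildcard-substituted as a whole...
        par1 = "{sample}",
        # ...while a function is called per job, after the wildcard is known,
        # so the split runs on e.g. "id1-one" rather than on "{sample}".
        par3 = lambda wildcards: wildcards.sample.split("-")
    shell:
        "./myprog -i {input.i} -o {output.o} par1: {params.par1} par3: {params.par3} par3[0]: {params.par3[0]}"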