snakemake substitution issue
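I am new to snakemake and have a question about the following code, which should take 9 fastq files one after another and run fastqc on each.
smp should take the following values: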
UG1_S12
UG2_S13
UG3_S14
UR1_S1
UR2_S2
UR3_S3
UY1_S6
UY2_S7
UY3_S8
When I run
SAMPLES, = glob_wildcards("reads/merged_s{smp}_L001.fastq.gz")
NB_SAMPLES = len(SAMPLES)
for smp in SAMPLES:
    message("Sample " + smp + " will be processed")
message("N= " + str(NB_SAMPLES))
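the problem is the substitution of {smp} below: it is first replaced by UY2_S7 and then, in the mv commands, by UY3_S8.
How can I make sure that the same substitution is used in both sub-commands of the same rule?
My current code (inspired by):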
SAMPLES, = glob_wildcards("reads/merged_s{smp}_L001.fastq.gz")

rule all:
    input:
        expand("reads/merged_s{smp}_L001.fastq.gz", smp=SAMPLES),
        "results/multiqc.html"

rule fastqc:
    """
    Run FastQC on each FASTQ file.
    """
    input:
        "reads/merged_s{smp}_L001.fastq.gz"
    output:
        "results/{smp}_fastqc.html",
        "intermediate/{smp}_fastqc.zip"
    version: "1.0"
    shadow: "minimal"
    threads: 8
    shell:
        """
        # Run fastQC and save the output to the current directory
        fastqc {input} -t {threads} -q -d . -o .
        # Move the files which are used in the workflow
        mv merged_s{smp}_L001_fastqc.html {output[0]}
        mv merged_s{smp}_L001_fastqc.zip {output[1]}
        """
The error:
Error in rule fastqc:
jobid: 0
output: results/UY2_S7_fastqc.html, intermediate/UY2_S7_fastqc.zip
RuleException:
CalledProcessError in line 60 of Snakefile:
Command ' set -euo pipefail;
# Run fastQC and save the output to the current directory
fastqc reads/merged_sUY2_S7_L001.fastq.gz -t 8 -q -d . -o .
# Move the files which are used in the workflow
mv merged_sUY3_S8_L001_fastqc.html results/UY2_S7_fastqc.html
mv merged_sUY3_S8_L001_fastqc.zip intermediate/UY2_S7_fastqc.zip ' returned non-zero exit status 130.
File "Snakefile", line 60, in __rule_fastqc
File "/opt/biotools/miniconda2/envs/snakemake-tutorial/lib/python3.6/concurrent/futures/thread.py", line 56, in run
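If you want to use a wildcard in a shell command, you have to write it as {wildcards.smp}.
What is probably happening is that {smp} in the shell command picks up the value left over from the last iteration of the for loop above. So change: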
    shell:
        """
        # Run fastQC and save the output to the current directory
        fastqc {input} -t {threads} -q -d . -o .
        # Move the files which are used in the workflow
        mv merged_s{smp}_L001_fastqc.html {output[0]}
        mv merged_s{smp}_L001_fastqc.zip {output[1]}
        """
to:
    shell:
        """
        # Run fastQC and save the output to the current directory
        fastqc {input} -t {threads} -q -d . -o .
        # Move the files which are used in the workflow
        mv merged_s{wildcards.smp}_L001_fastqc.html {output[0]}
        mv merged_s{wildcards.smp}_L001_fastqc.zip {output[1]}
        """
I have not checked the rest of the code.
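To see why a leftover loop variable can end up in the command, here is a minimal plain-Python sketch of the suspected mechanism. It is not Snakemake's actual code: the `render` helper and the `snakefile_globals` dict are hypothetical stand-ins, on the assumption that the shell string is filled from the Snakefile's global names plus per-job objects such as `wildcards`.

```python
from types import SimpleNamespace

SAMPLES = ["UY2_S7", "UY3_S8"]

# A plain Python loop leaves its variable bound after it finishes,
# so `smp` still holds the last sample here.
for smp in SAMPLES:
    pass

# Stand-in for the Snakefile's global namespace after the loop (assumption).
snakefile_globals = {"smp": smp}


def render(cmd, namespace, **job_vars):
    """Hypothetical helper: fill {placeholders} from a namespace plus per-job values."""
    values = dict(namespace)
    values.update(job_vars)
    return cmd.format(**values)


# The job being executed is the one for sample UY2_S7.
job = {"wildcards": SimpleNamespace(smp="UY2_S7")}

# {smp} resolves to the leftover global, not to this job's wildcard.
leaky = render("mv merged_s{smp}_L001_fastqc.html", snakefile_globals, **job)
# {wildcards.smp} resolves to the value that belongs to this job.
fixed = render("mv merged_s{wildcards.smp}_L001_fastqc.html", snakefile_globals, **job)

print(leaky)  # mv merged_sUY3_S8_L001_fastqc.html
print(fixed)  # mv merged_sUY2_S7_L001_fastqc.html
```

The first line printed reproduces the mismatch in the error message (UY3_S8 in the mv source while the job is for UY2_S7). Renaming the loop variable so it does not collide with the wildcard name would hide the symptom, but {wildcards.smp} is the form that refers to the job's own value.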