How to run a bash script inside a snakemake pipeline
I want to run a bash script inside a snakemake pipeline, but I don't know how to refer to snakemake's input and output from within the bash script.
Snakefile:
rule xxx:
    input:
        "input.vcf"
    output:
        "output.tab"
    shell:
        """
        some_bash.sh {input} {output}
        """
bash script:
#!/bin/bash
paste <(bcftools snakemake@input[0] |\
awk -F"\t" 'BEGIN {print "CHR\tPOS\tID\tREF\tALT\tFILTER"} \
!/^#/ {print $1"\t"$2"\t"$3"\t"$4"\t"$5"\t"$7}') \
\
<(bcftools query -f '[\t%SAMPLE=%GT]\n' snakemake@input[0] |\
awk 'BEGIN {print "nHet"} {print gsub(/0\|1|1\|0|0\/1|1\/0/, "")}') \
\
<(bcftools query -f '[\t%SAMPLE=%GT]\n' snakemake@input[0] |\
awk 'BEGIN {print "nHomAlt"} {print gsub(/1\|1|1\/1/, "")}') \
\
<(bcftools query -f '[\t%SAMPLE=%GT]\n' snakemake@input[0] |\
awk 'BEGIN {print "nHomRef"} {print gsub(/0\|0|0\/0/, "")}') \
\
<(bcftools snakemake@input[0] | awk -F"\t" '/^#CHROM/ {split($0, header, "\t"); print "HetSamples"} \
!/^#CHROM/ {for (i=10; i<=NF; i++) {if (gsub(/0\|1|1\|0|0\/1|1\/0/, "", $(i))==1) {printf header[i]","}; if (i==NF) {printf "\n"}}}') \
\
<(bcftools snakemake@input[0] | awk -F"\t" '/^#CHROM/ {split($0, header, "\t"); print "HomSamplesAlt"} \
!/^#CHROM/ {for (i=10; i<=NF; i++) {if (gsub(/1\|1|1\/1/, "", $(i))==1) {printf header[i]","}; if (i==NF) {printf "\n"}}}') \
\
| sed 's/,\t/\t/g' | sed 's/,$//g' > snakemake@output[0]
The error I get:
[E::main] unrecognized command 'snakemake@input[0]'
[E::main] unrecognized command 'snakemake@input[0]'
[E::main] unrecognized command 'snakemake@input[0]'
[E::hts_open_format] [E::hts_open_format] Failed to open file "snakemake@input[0]" : No such file or directoryFailed to open file "snakemake@input[0]" : No such file or directory
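You need to use bash syntax to read the input arguments. snakemake@input[0] is specific to R scripts that are run with the script directive; in a bash script called from a shell block it is just a literal string, which is why bcftools reports it as an unrecognized command or a missing file.
Specifically, you can replace snakemake@input[0] with $1, which holds the first argument passed to the bash script, and snakemake@output[0] with $2, the second argument. To be safe, wrap them in double quotes in case a file name contains spaces, e.g. "$1".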
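As a rough illustration of the positional-parameter approach, here is a minimal sketch. The function name extract_sites and the two-column output are assumptions made for the example, not part of the original script; it uses plain awk on an uncompressed VCF as a stand-in for the full bcftools pipeline.

```shell
#!/bin/bash
# Sketch only: extract_sites is a hypothetical stand-in for the full pipeline.
# The first argument is the path Snakemake substitutes for {input},
# the second is the path it substitutes for {output}.
extract_sites() {
    local in_vcf="$1"
    local out_tab="$2"
    # Print a header, then CHROM and POS for every non-header line.
    # The double quotes keep paths containing spaces intact.
    awk -F"\t" 'BEGIN {print "CHR\tPOS"} !/^#/ {print $1"\t"$2}' "$in_vcf" > "$out_tab"
}

# Run only when both paths were supplied, e.g.: some_bash.sh input.vcf output.tab
if [ "$#" -eq 2 ]; then
    extract_sites "$1" "$2"
fi
```

With the rule from the question, Snakemake would expand the shell line to some_bash.sh input.vcf output.tab, so "$1" resolves to input.vcf and "$2" to output.tab.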