Snakemake InputFunctionException. AttributeError: 'Wildcards' object has no attribute
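I have a list object with ChIP-seq single-end fastq file names: allfiles=['/path/file1.fastq','/path/file2.fastq','/path/file3.fastq']. I am trying to use that object, allfiles, as a wildcard. I want it to be the input of the fastqc rule (and of other rules such as mapping, but let's keep it simple). I tried what is shown in the code below (lambda wildcards: data.loc[(wildcards.sample),'read1']). However, this gives me the error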
"InputFunctionException in line 118 of Snakefile:
AttributeError: 'Wildcards' object has no attribute 'sample'
Wildcards:
"
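Does anyone know exactly how to define this? It seems I am close: I get the general idea, but I cannot get the syntax right and make it run. Thanks!
Code: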
import pandas as pd
import numpy as np
# Read in config file parameters
configfile: 'config.yaml'
sampleFile = config['samples'] # three columns: sample ID , /path/to/chipseq_file_SE.fastq , /path/to/chipseq_input.fastq
outputDir = config['outputdir'] # output directory
outDir = outputDir + "/MyExperiment"
qcDir = outDir + "/QC"
# Read in the samples table
data = pd.read_csv(sampleFile, header=0, names=['sample', 'read1', 'inputs']).set_index('sample', drop=False)
samples = data['sample'].unique().tolist() # sample IDs
read1 = data['read1'].unique().tolist() # ChIP-treatment file single-end file
inplist= data['inputs'].unique().tolist() # the ChIP-input files
inplistUni= data['inputs'].unique().tolist() # the ChIP-input files (unique)
allfiles = read1 + inplistUni
# Target rule
rule all:
    input:
        expand(f'{qcDir}' + '/raw/{sample}_fastqc.html', sample=samples),
        expand(f'{qcDir}' + '/raw/{sample}_fastqc.zip', sample=samples),
# fastqc report generation
rule fastqc:
    input: lambda wildcards: data.loc[(wildcards.sample), 'read1']
    output:
        html=expand(f'{qcDir}' + '/raw/{sample}_fastqc.html',sample=samples) ,
        zip=expand(f'{qcDir}' + '/raw/{sample}_fastqc.zip',sample=samples)
    log: expand(f'{logDir}' + '/qc/{sample}_fastqc_raw.log',sample=samples)
    threads: 4
    wrapper: "fastqc {input} 2>> {log}"
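Currently, the output files of rule fastqc do not contain any wildcards once expand() has resolved them. That is, your Snakefile currently defines a single job of rule fastqc that attempts to produce the output files for all samples at once. Because no wildcard is left in the output, the wildcards object passed to the input function is empty, which is why wildcards.sample raises the AttributeError.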
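To make the point about resolved wildcards concrete, here is a minimal sketch in plain Python. The helper expand_sample is a hypothetical stand-in written for illustration, not Snakemake's actual expand() implementation, and the sample names and path are made up. It only shows that once the pattern is filled in for every sample, no {sample} placeholder remains for Snakemake to match.

```python
import re


def expand_sample(pattern, samples):
    """Hypothetical stand-in for expand(): fill {sample} once per sample."""
    return [pattern.replace("{sample}", s) for s in samples]


def wildcard_names(path):
    """Return the names of the {placeholders} still present in a path."""
    return re.findall(r"\{(\w+)\}", path)


pattern = "QC/raw/{sample}_fastqc.html"          # assumed example path
expanded = expand_sample(pattern, ["sampleA", "sampleB"])

print(wildcard_names(pattern))                   # the raw pattern still has a wildcard
print([wildcard_names(p) for p in expanded])     # the expanded paths have none
```

With the raw pattern as output, the rule can be matched once per sample; with the expanded list, the rule describes one job with fixed file names.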
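However, it appears you want to run rule fastqc separately for each sample. In that case, the rule needs to be generalized as below, where {sample} is the wildcard: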
rule fastqc:
    input: lambda wildcards: data.loc[(wildcards.sample), 'read1']
    output:
        html=qcDir + '/raw/{sample}_fastqc.html',
        zip=qcDir + '/raw/{sample}_fastqc.zip'
    log: logDir + '/qc/{sample}_fastqc_raw.log'
    threads: 4
    shell: "fastqc {input} 2>> {log}"
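As a rough illustration of how the input function behaves in both situations, the following plain-Python sketch mimics the lookup. It makes several assumptions: READ1_BY_SAMPLE is a hypothetical dictionary standing in for the pandas table data, fastqc_input is a named version of the lambda, and SimpleNamespace stands in for Snakemake's Wildcards object. None of this is Snakemake code; it only mirrors the attribute access that fails.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the samples table: sample ID -> read1 fastq path.
READ1_BY_SAMPLE = {
    "sampleA": "/path/file1.fastq",
    "sampleB": "/path/file2.fastq",
}


def fastqc_input(wildcards):
    """Named equivalent of the lambda: look up the fastq for one sample."""
    return READ1_BY_SAMPLE[wildcards.sample]


# Output pattern contains {sample}: the object carries a .sample attribute.
with_wildcard = SimpleNamespace(sample="sampleA")
print(fastqc_input(with_wildcard))

# Output was built with expand(): nothing was matched, so there is no .sample.
without_wildcard = SimpleNamespace()
try:
    fastqc_input(without_wildcard)
except AttributeError as err:
    print("lookup failed:", err)
```

The second call fails for the same reason as the original rule: the function asks for an attribute that only exists when the output file names contain the corresponding wildcard.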