"Error: No such file or directory" for input that has to be generated first
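I have two STAR rules. STAR_genome builds an index for the STAR rule, so the input of STAR is the direct output of STAR_genome; simple so far. But when I try to run it, the STAR_genome rule is ignored (it is not listed in the job counts) and I get the following exception: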
FileNotFoundError: [Errno 2] No such file or directory: '[...]STAR/cauliflower/genome/genome.ok'
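I don't understand why snakemake ignores the generating rule and only complains about the missing file, since it even takes the path from the rule that is supposed to produce it...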
rule STAR_genome:
    input: genome=lambda wildcards: config[wildcards.species]["genomefile"]
    output: ok=path.join(STAR_DIR, "{species}", "genome", "genome.ok")
    threads: 32
    envmodules:
        config["STAR"][0],
        config["STAR"][1]
    script:
        "scripts/Trinity_GG/STAR_genome.py"
############################################################################
rule STAR:
    input:
        genome=rules.STAR_genome.output.ok,
        r1=rules.trim_galore.output.r1,
        r2=rules.trim_galore.output.r2
    output:
        bam=path.join(STAR_DIR, "{species}_{rep}_Aligned.sortedByCoord.out.bam")
    threads: 32
    envmodules:
        config["STAR"][0],
        config["STAR"][1]
    script:
        "scripts/Trinity_GG/STAR.py"
Here is the full traceback, in case it helps.
Traceback (most recent call last):
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/__init__.py", line 751, in snakemake
keepmetadata=keep_metadata,
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/workflow.py", line 1000, in execute
success = scheduler.schedule()
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/scheduler.py", line 444, in schedule
run = self.job_selector(needrun)
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/scheduler.py", line 731, in job_selector_greedy
c = list(map(self.job_reward, jobs)) # job rewards
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/scheduler.py", line 814, in job_reward
input_size = job.inputsize
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/jobs.py", line 378, in inputsize
self._inputsize = sum(f.size for f in self.input)
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/jobs.py", line 378, in <genexpr>
self._inputsize = sum(f.size for f in self.input)
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/io.py", line 239, in wrapper
return func(self, *args, **kwargs)
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/io.py", line 254, in wrapper
return func(self, *args, **kwargs)
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/io.py", line 553, in size
return self.size_local
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/io.py", line 558, in size_local
self.check_broken_symlink()
File "/cluster/easybuild/broadwell/software/mflow/0.0-foss-2019b-Python-3.7.4/lib/python3.7/site-packages/snakemake-5.27.4-py3.7.egg/snakemake/io.py", line 563, in check_broken_symlink
if not self.exists_local and os.lstat(self.file):
FileNotFoundError: [Errno 2] No such file or directory: '[...]/Cauliflower_Test/STAR/cauliflower/genome/genome.ok'
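With the help of snakemake --debug-dag, I found that I had made a mistake in the config lookup in
input: genome=lambda wildcards: config[wildcards.species]["genomefile"]
The correct key is "genome_file". The normal run never mentioned this, but the --debug-dag output shows it: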
candidate job STAR
wildcards: species=cauliflower, rep=A
file [...]/Cauliflower_Test/Alignmentscores/cauliflower_align_rate.txt:
No producers found, but file is present on disk.
Error:
KeyError: 'genomefile'
Wildcards:
species=cauliflower
Traceback:
File "[...]/workflows/Trinity_GG.smk", line 9, in <lambda>
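Since the KeyError from the input lambda only showed up with --debug-dag, one way to catch this kind of typo earlier might be to check the config keys when the workflow file is parsed, before any rule is evaluated. This is a minimal sketch, assuming config holds one sub-dict per species next to other entries such as "STAR"; the function name, the species list and the example values are hypothetical and not part of the original workflow.

```python
def missing_config_keys(config, species, required=("genome_file",)):
    """Return the species (or 'species.key' entries) that config lacks."""
    missing = []
    for sp in species:
        entry = config.get(sp)
        if not isinstance(entry, dict):
            # The species has no section of its own in the config.
            missing.append(sp)
            continue
        missing.extend(f"{sp}.{key}" for key in required if key not in entry)
    return missing


# Hypothetical config mirroring the layout used in the question.
config = {
    "STAR": ["module_a", "module_b"],
    "cauliflower": {"genome_file": "genomes/cauliflower.fa"},
}

# Placed near the top of the workflow file, this stops with a readable
# message instead of leaving the lookup to fail inside an input function.
problems = missing_config_keys(config, ["cauliflower"])
if problems:
    raise KeyError(f"config is missing: {', '.join(problems)}")
```

With a check like this, looking up "genomefile" instead of "genome_file" would be reported as cauliflower.genomefile as soon as the workflow is loaded.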
I hope this helps someone else; it cost me several hours...