What are snakemake metadata files? When can I erase those?

I noticed that my backup rsync script spends quite a long time copying files with random-looking names from the .snakemake/metadata folders.

What are these files used for?

Can I safely erase them after a snakemake run has completed, or are they necessary for snakemake to correctly perform the next run?

More generally, is there any documentation about the files that snakemake creates in the .snakemake folder?

From this comment by Johannes Koster, the creator of Snakemake:

[The .snakemake/ directory] is used to track (a) the value of the version keyword for each file, (b) the rule implementation for each file, in order to notify the user if something has changed when snakemake is invoked with --summary.

From a related comment on the Google group:

In general, it is safe to delete the entire .snakemake directory if there is no running Snakemake instance and you are sure that all existing output files are complete. It only contains data provenance information (e.g., to track code input file or parameter changes [to determine if the workflow should be re-run]). You might want to keep .snakemake/conda, since it contains the conda environments used in your workflow.

Edit: to delete the .snakemake/ directory automatically after the pipeline has executed successfully, you can use the onsuccess hook:

import shutil

# Runs once, after the workflow has finished without errors.
onsuccess:
    shutil.rmtree(".snakemake")
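
Note that removing the whole directory also removes .snakemake/conda, which the comment quoted above suggests keeping. A minimal sketch of a more selective cleanup follows. The helper name `clean_snakemake_dir` and its `keep` parameter are hypothetical and not part of Snakemake. I have not tested it against a live run, so treat it as a starting point rather than a verified recipe:

```python
import shutil
from pathlib import Path


def clean_snakemake_dir(root=".snakemake", keep=("conda",)):
    """Delete everything inside `root` except the entries named in `keep`.

    Hypothetical helper, not part of Snakemake. Returns the sorted names
    of the top-level entries that were removed.
    """
    root = Path(root)
    if not root.is_dir():
        # Nothing to clean, e.g. the workflow has never been run here.
        return []

    removed = []
    for entry in root.iterdir():
        if entry.name in keep:
            continue  # e.g. keep the conda environments
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed.append(entry.name)
    return sorted(removed)
```

If the function is defined at the top of the Snakefile, the body of the onsuccess block could presumably call clean_snakemake_dir() in place of shutil.rmtree(".snakemake"). As the quoted comment says, only do this when no other Snakemake instance is running in the same directory.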

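This is an old question by now, and this doesn't really answer it, but since you mention rsync: you can skip the .snakemake directory with the --exclude option, for example: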

rsync ... --exclude='.snakemake' source/ dest/