'InputFiles' object has no attribute <X> when using a function as input for a snakemake rule

Question

I have a snakemake workflow in which some rules use a complex function as input:

def source_fold_data(wildcards):
    fold_type = wildcards.fold_type
    if fold_type in {"log2FoldChange", "lfcMLE"}:
        if hasattr(wildcards, "contrast_type"):
            # OPJ is os.path.join
            return expand(
                OPJ(output_dir, aligner, "mapped_C_elegans",
                    "deseq2_%s" % size_selected, "{contrast}",
                    "{contrast}_{{small_type}}_counts_and_res.txt"),
                contrast=contrasts_dict[wildcards.contrast_type])
        else:
            return rules.small_RNA_differential_expression.output.counts_and_res
    elif fold_type == "mean_log2_RPKM_fold":
        if hasattr(wildcards, "contrast_type"):
            # This is the branch used when I have the AttributeError
            #
            return [filename.format(wildcards) for filename in expand(
                OPJ(output_dir, aligner, "mapped_C_elegans",
                    "RPKM_folds_%s" % size_selected, "{contrast}",
                    "{contrast}_{{0.small_type}}_RPKM_folds.txt"),
                contrast=contrasts_dict[wildcards.contrast_type])]
        else:
            return rules.compute_RPKM_folds.output.fold_results
    else:
        raise NotImplementedError("Unknown fold type: %s" % fold_type)

The above function is used as input for two rules:

rule make_gene_list_lfc_boxplots:
    input:
        data = source_fold_data,
    output:
        boxplots = OPJ(output_dir, "figures", "{contrast}",
            "{contrast}_{small_type}_{fold_type}_{gene_list}_boxplots.{fig_format}")
    params:
        id_lists = set_id_lists,
    run:
        data = pd.read_table(input.data, index_col="gene")
        lfcs = pd.DataFrame(
            {list_name : data.loc[set(id_list)][wildcards.fold_type] for (
                list_name, id_list) in params.id_lists.items()})
        save_plot(output.boxplots, plot_boxplots, lfcs, wildcards.fold_type)


rule make_contrast_lfc_boxplots:
    input:
        data = source_fold_data,
    output:
        boxplots = OPJ(output_dir, "figures", "all_{contrast_type}",
            "{contrast_type}_{small_type}_{fold_type}_{gene_list}_boxplots.{fig_format}")
    params:
        id_lists = set_id_lists,
    run:
        lfcs = pd.DataFrame(
            {f"{contrast}_{list_name}" : pd.read_table(filename, index_col="gene").loc[
                set(id_list)]["mean_log2_RPKM_fold"] for (
                    contrast, filename) in zip(contrasts_dict["ip"], input.data) for (
                        list_name, id_list) in params.id_lists.items()})
        save_plot(output.boxplots, plot_boxplots, lfcs, wildcards.fold_type)

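The second one fails with 'InputFiles' object has no attribute 'data', and only in some cases: I run the same workflow with two different configuration files, and the error occurs with only one of them, even though the rule is executed in both cases and the same branch of the input function is taken.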

How can this happen when the rule declares a named input like this?

    input:
        data = ...

I suppose this has to do with what my source_fold_data returns, which is either the explicit output of another rule or a "manually" constructed list of file names.

Answer 1

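As suggested by @Colin in the comments, the problem occurs when the input function returns an empty list. This is the case when contrasts_dict[wildcards.contrast_type] is an empty list, a situation in which trying to generate the output of rule make_contrast_lfc_boxplots does not actually make sense. I avoided this situation by modifying the input section of rule all as follows: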

Old version:

rule all:
    input:
        # [...]
        expand(OPJ(output_dir, "figures", "all_{contrast_type}", "{contrast_type}_{small_type}_{fold_type}_{gene_list}_boxplots.{fig_format}"), contrast_type=["ip"], small_type=IP_TYPES, fold_type=["mean_log2_RPKM_fold"], gene_list=BOXPLOT_GENE_LISTS, fig_format=FIG_FORMATS),
        # [...]

New version:

if contrasts_dict["ip"]:
    ip_fold_boxplots = expand(OPJ(output_dir, "figures", "all_{contrast_type}", "{contrast_type}_{small_type}_{fold_type}_{gene_list}_boxplots.{fig_format}"), contrast_type=["ip"], small_type=IP_TYPES, fold_type=["mean_log2_RPKM_fold"], gene_list=BOXPLOT_GENE_LISTS, fig_format=FIG_FORMATS)
else:
    ip_fold_boxplots = []
rule all:
    input:
        # [...]
        ip_fold_boxplots,
        # [...]
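
The guard above could be generalized to several contrast types. The following is only a sketch in plain Python: the helper name boxplot_targets, its parameters, and the default output_dir are hypothetical, and itertools.product stands in for snakemake's expand so that the example runs on its own.

```python
import os
from itertools import product


def boxplot_targets(contrasts_dict, contrast_types, small_types, fold_types,
                    gene_lists, fig_formats, output_dir="results"):
    """Return figure paths only for contrast types that have contrasts.

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    pattern = os.path.join(
        output_dir, "figures", "all_{contrast_type}",
        "{contrast_type}_{small_type}_{fold_type}_{gene_list}_boxplots.{fig_format}")
    # Skip contrast types whose contrast list is empty or missing,
    # so no target is requested that would get an empty input list.
    usable = [ct for ct in contrast_types if contrasts_dict.get(ct)]
    return [
        pattern.format(contrast_type=ct, small_type=st, fold_type=ft,
                       gene_list=gl, fig_format=ff)
        for ct, st, ft, gl, ff in product(
            usable, small_types, fold_types, gene_lists, fig_formats)]
```

With an empty contrasts_dict["ip"], the helper returns an empty list, so rule all would simply not request those figures.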

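Some tinkering with snakemake/rules.py showed that, at some point, the data attribute exists in the input attribute of the Rule object named make_contrast_lfc_boxplots, and that this attribute is still the source_fold_data function. I suppose it is evaluated later and removed when it turns out to be an empty list, but I have not been able to find where.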

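I suppose an empty input is not a problem when snakemake builds the dependency graph between rules. The problem therefore only shows up while the rule is being executed.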
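
Another option, shown here only as a sketch, is to make the input function itself refuse to return an empty list, so that the failure carries a readable message instead of the AttributeError. The names require_nonempty and fold_files_for, as well as the contents of the stand-in contrasts_dict, are hypothetical and not part of the original workflow.

```python
# Hypothetical stand-in for the workflow's contrasts_dict
contrasts_dict = {"ip": [], "de": ["mut_vs_WT"]}


def require_nonempty(files, what):
    """Return files as a list, or raise if there are none."""
    files = list(files)
    if not files:
        raise ValueError(
            f"No input files for {what}; check the configuration.")
    return files


def fold_files_for(contrast_type):
    """Toy version of the contrast_type branch of an input function."""
    files = [f"{contrast}_RPKM_folds.txt"
             for contrast in contrasts_dict[contrast_type]]
    return require_nonempty(files, f"contrast type {contrast_type!r}")
```

In the real workflow, the same check could wrap the lists returned by the contrast_type branches of source_fold_data.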

