Quoting for incremental model partitions to replace - dbt v0.17.2

Question

I am writing an incremental model for BigQuery using the insert_overwrite strategy, and I am trying to set partitions_to_replace from a variable:

{% set partitions_to_replace = [var('execution_date')] %}

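Just to test compilation, I am compiling with a variable in my dbt_project.yml that looks like execution_date: '2020-01-01'. However, the date does not appear to be quoted in the merge statement generated by the materialization, so it fails with the error No matching signature for operator IN for argument types DATE and {INT64}. Here is the relevant snippet of the generated SQL: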

when not matched by source
         and DBT_INTERNAL_DEST.visit_date in (
              2020-01-01
          )

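Is there a way to ensure the variable is wrapped in quotes? When I use a variable in SQL that I write myself, I know I can wrap the var function call in quotes, but in this case the SQL is generated by the materialization.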

Answer 1

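That's a fair question. For the sake of flexibility, the materialization does not try to wrap the partitions values in quotes, so that it can accept both SQL expressions and literals as input.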

That is, you may want the merge predicate to be:

when not matched by source
         and DBT_INTERNAL_DEST.visit_date in (
              '2020-01-01'
          )

But you may just as well want it to be:

    when not matched by source
         and DBT_INTERNAL_DEST.visit_date in (
              date_sub(current_date, interval 1 day)
          )
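To see why the unquoted value fails, here is a minimal Python sketch of the pass-through behaviour described above. It is not dbt's actual implementation: render_in_predicate is a hypothetical helper that simply joins the partition values verbatim, which is what the generated merge statement appears to do. Left unquoted, 2020-01-01 is most likely read by BigQuery as integer arithmetic rather than a date, which would explain the INT64 in the error message.

```python
def render_in_predicate(column, partitions):
    """Join partition values verbatim, adding no quotes (hypothetical helper)."""
    return f"{column} in ({', '.join(partitions)})"


column = "DBT_INTERNAL_DEST.visit_date"

# Raw value from the var: rendered as-is, so it is not a string literal.
unquoted = render_in_predicate(column, ["2020-01-01"])

# Value that already carries its own single quotes: rendered as a literal.
quoted = render_in_predicate(column, ["'2020-01-01'"])

# SQL expression: passes through untouched, which is the reason for not auto-quoting.
expression = render_in_predicate(column, ["date_sub(current_date, interval 1 day)"])

for predicate in (unquoted, quoted, expression):
    print(predicate)
```

Quoting every value automatically would fix the first case but break the third, so the quoting has to be supplied by the caller.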

So you'll need to either:

Pass a string literal into your var by wrapping it in double quotes:

vars:
  execution_date: "'2020-01-01'"

Or handle the additional quoting in your set statement, along the lines of:

{% set partitions_to_replace = [] %}
{% for execution_date in [var('execution_date')] %}
    {% set ex_date %} '{{ execution_date }}' {% endset %}
    {% do partitions_to_replace.append(ex_date) %}
{% endfor %}
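The loop above boils down to wrapping each raw value in single quotes before it reaches the partitions list. The same idea as a small Python sketch, where quote_partitions is a hypothetical name used only for illustration and the values are assumed to contain no single quotes of their own:

```python
def quote_partitions(values):
    """Wrap each raw value in single quotes so it renders as a SQL string literal.

    Assumes the values contain no embedded single quotes; no escaping is done.
    """
    return [f"'{value}'" for value in values]


# Equivalent of looping over [var('execution_date')] in the Jinja snippet.
partitions_to_replace = quote_partitions(["2020-01-01"])
print(", ".join(partitions_to_replace))
```

Unlike the Jinja set block, this version adds no surrounding whitespace, which makes no difference to the resulting SQL.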

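Check out this related issue. The OP has some suggestions for syntax we could add to make this more straightforward; I'd be curious to hear which of them would make sense to you.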

Answer 2

I had the same issue when passing the variable through the CLI, i.e.:

dbt run --vars "execution_date: 2021-09-28"

My workaround was as follows:

Set a partitions_to_replace variable:

{%- set partitions_to_replace -%} '{{var("execution_date")}}' {%- endset -%}

Then, in my config, I just placed this variable in a list:

partitions = [partitions_to_replace]

Finally, to render the variable within the SQL model, just call it like this:

partition_date = {{partitions_to_replace}}
