How can I refer to a variable in PostgreSQL dynamic SQL?

I'm trying to write a PostgreSQL function for table upserts that can be used for any table. My starting point is a concrete function written for one specific table type:

CREATE TABLE doodad(id BIGINT PRIMARY KEY, data JSON);
CREATE OR REPLACE FUNCTION upsert_doodad(d doodad) RETURNS VOID AS
  $BODY$
BEGIN
  LOOP
    UPDATE doodad
       SET id = (d).id, data = (d).data
     WHERE id = (d).id;
    IF found THEN
      RETURN;
    END IF;

    -- does not exist, or was just deleted.

    BEGIN
      INSERT INTO doodad SELECT d.*;
      RETURN;
    EXCEPTION when UNIQUE_VIOLATION THEN
      -- do nothing, and loop to try the update again
    END;

  END LOOP;
END;
  $BODY$
LANGUAGE plpgsql;

The dynamic SQL version for any table that I've come up with is here (SQL Fiddle):

CREATE OR REPLACE FUNCTION upsert(target ANYELEMENT) RETURNS VOID AS
$$
DECLARE
  attr_name NAME;
  col TEXT;
  selectors TEXT[];
  setters TEXT[];
  update_stmt TEXT;
  insert_stmt TEXT;
BEGIN
  FOR attr_name IN SELECT a.attname
                     FROM pg_index i
                     JOIN pg_attribute a ON a.attrelid = i.indrelid 
                                        AND a.attnum = ANY(i.indkey)
                    WHERE i.indrelid = format_type(pg_typeof(target), NULL)::regclass
                      AND i.indisprimary
  LOOP
    selectors := array_append(selectors, format('%1$s = target.%1$s', attr_name));
  END LOOP;

  FOR col IN SELECT json_object_keys(row_to_json(target))
  LOOP
    setters := array_append(setters, format('%1$s = (target).%1$s', col)); 
  END LOOP;

  update_stmt := format(
    'UPDATE %s SET %s WHERE %s',
    pg_typeof(target),
    array_to_string(setters, ', '),
    array_to_string(selectors, ' AND ')
  );
  insert_stmt := format('INSERT INTO %s SELECT (target).*', pg_typeof(target));

  LOOP
    EXECUTE update_stmt; 
    IF found THEN
      RETURN;
    END IF;

    BEGIN
      EXECUTE insert_stmt;
      RETURN;
    EXCEPTION when UNIQUE_VIOLATION THEN
      -- do nothing
    END;
  END LOOP;
END;
$$
LANGUAGE plpgsql;

When I try to use this function, I get an error:

SELECT * FROM upsert(ROW(1,'{}')::doodad);

ERROR: column "target" does not exist: SELECT * FROM upsert(ROW(1,'{}')::doodad)

I tried changing the upsert statement to use placeholders, but I can't figure out how to invoke it with the record:

EXECUTE update_stmt USING target;

ERROR: there is no parameter $1: SELECT * FROM upsert(ROW(1,'{}')::doodad)

EXECUTE update_stmt USING target.*;

ERROR: query "SELECT target.*" returned 2 columns: SELECT * FROM upsert(ROW(1,'{}')::doodad)

I feel I'm really close to a solution, but I can't work out the syntax.

Short answer: you can't.

Variable substitution does not happen in the command string given to EXECUTE or one of its variants. If you need to insert a varying value into such a command, do so as part of constructing the string value, or use USING, as illustrated in Section 40.5.4. 1
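As a minimal sketch of what that passage means, assuming the doodad table from the question exists: the commented-out EXECUTE would fail because the command string cannot see PL/pgSQL variables, while the second passes the value through USING. The names v_id and v_count are illustrative only.

```sql
DO $$
DECLARE
  v_id    BIGINT := 1;
  v_count BIGINT;
BEGIN
  -- Fails if uncommented: inside the string, v_id is parsed as a column name,
  -- because no variable substitution happens in an EXECUTE command string.
  -- EXECUTE 'SELECT count(*) FROM doodad WHERE id = v_id' INTO v_count;

  -- Works: the value is bound to the placeholder $1 via USING.
  EXECUTE 'SELECT count(*) FROM doodad WHERE id = $1'
     INTO v_count
    USING v_id;

  RAISE NOTICE 'matching rows: %', v_count;
END;
$$;
```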

Longer answer:

SQL statements and expressions within a PL/pgSQL function can refer to variables and parameters of the function. Behind the scenes, PL/pgSQL substitutes query parameters for such references. 2

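This is the first important piece of the puzzle: PL/pgSQL does magic transformations on function parameters, turning them into variable substitutions.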

The second is that fields of a variable substitution can be referenced:

Parameters to a function can be composite types (complete table rows). In that case, the corresponding identifier $n will be a row variable, and fields can be selected from it, for example $1.user_id. 3

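This excerpt confused me because it refers to function parameters. But knowing that function parameters are implemented as variable substitutions under the hood, it seemed I should be able to use the same syntax in EXECUTE.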
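A small sketch of that idea in isolation, again assuming the doodad table from the question: a whole row is passed as the single USING parameter, and the dynamic statement selects a field from it with $1.id. The variable names v_row and v_count are hypothetical.

```sql
DO $$
DECLARE
  v_row   doodad := ROW(1, '{}')::doodad;
  v_count BIGINT;
BEGIN
  -- $1 is bound to the composite value v_row; $1.id dereferences its id field.
  EXECUTE 'SELECT count(*) FROM doodad t WHERE t.id = $1.id'
     INTO v_count
    USING v_row;

  RAISE NOTICE 'rows with id %: %', v_row.id, v_count;
END;
$$;
```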

These two facts unlock the solution: pass the ROW variable in the USING clause, and dereference its fields in the dynamic SQL. The result (SQL Fiddle):

CREATE OR REPLACE FUNCTION upsert(v_target ANYELEMENT)
  RETURNS SETOF ANYELEMENT AS
$$
DECLARE
  v_target_name TEXT;
  v_attr_name NAME;
  v_selectors TEXT[];
  v_colname TEXT;
  v_setters TEXT[];
  v_update_stmt TEXT;
  v_insert_stmt TEXT;
  v_temp RECORD;
BEGIN
  v_target_name := format_type(pg_typeof(v_target), NULL);

  FOR v_attr_name IN SELECT a.attname
                     FROM pg_index i
                     JOIN pg_attribute a ON a.attrelid = i.indrelid 
                                        AND a.attnum = ANY(i.indkey)
                    WHERE i.indrelid = v_target_name::regclass
                      AND i.indisprimary
  LOOP
    v_selectors := array_append(v_selectors, format('t.%1$I = $1.%1$I', v_attr_name));
  END LOOP;

  FOR v_colname IN SELECT json_object_keys(row_to_json(v_target))
  LOOP
    v_setters := array_append(v_setters, format('%1$I = $1.%1$I', v_colname));
  END LOOP;

  v_update_stmt := format(
      'UPDATE %I t SET %s WHERE %s RETURNING t.*',
      v_target_name,
      array_to_string(v_setters, ','),
      array_to_string(v_selectors, ' AND ')
  );

  v_insert_stmt = format('INSERT INTO %I SELECT $1.*', v_target_name);

  LOOP
    EXECUTE v_update_stmt INTO v_temp USING v_target;
    IF v_temp IS NOT NULL THEN
      EXIT;
    END IF;

    BEGIN
      EXECUTE v_insert_stmt USING v_target;
      EXIT;
    EXCEPTION when UNIQUE_VIOLATION THEN
      -- do nothing
    END;
  END LOOP;
  RETURN QUERY SELECT v_target.*;
END;
$$
LANGUAGE plpgsql;
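A possible way to exercise the function, assuming it is installed with the $1 placeholders in place and that the doodad table from the question has no row with id 1 yet. The JSON payloads are made up for illustration.

```sql
-- First call: no row with id 1 exists, so the UPDATE matches nothing
-- and the INSERT branch should run.
SELECT * FROM upsert(ROW(1, '{"color": "red"}')::doodad);

-- Second call with the same key: the UPDATE branch should now match.
SELECT * FROM upsert(ROW(1, '{"color": "blue"}')::doodad);

-- Inspect the stored row.
SELECT id, data FROM doodad WHERE id = 1;
```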

For fans of writable CTEs, this converts easily to CTE form:

v_cte_stmt = format(
    'WITH up as (%s) %s WHERE NOT EXISTS (SELECT 1 from up t WHERE %s)',
    v_update_stmt,
    v_insert_stmt,
    array_to_string(v_selectors, ' AND '));

LOOP
  BEGIN
    EXECUTE v_cte_stmt USING v_target;
    EXIT;
  EXCEPTION when UNIQUE_VIOLATION THEN
    -- do nothing
  END;
END LOOP;
RETURN QUERY SELECT v_target.*;

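Note: I have done zero performance testing on this solution, and I am relying on the analysis of others for its correctness. For now it appears to run correctly on PostgreSQL 9.3 in my development environment. YMMV.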