Postgresql insert trigger becomes slow when querying current table

Question

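When inserting a lot number into the table, we count how many times the base number already exists and, based on that count, append -## to the end of the new number.

I have stripped out most of the logic (we check other things as well). I am also aware of the logic flaw here that skips -1.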

-- Function: stone._lsuniqueid()

-- DROP FUNCTION stone._lsuniqueid();

CREATE OR REPLACE FUNCTION stone._lsuniqueid()
  RETURNS trigger AS
$BODY$
DECLARE
 _count INTEGER;

BEGIN
  -- Obtain the number of occurrences of this new ls_number
  SELECT COUNT(ls_number) into _count
  FROM ls
  WHERE ls_number LIKE CAST(NEW.ls_number || '%' AS text);

  -- Allow new ls_numbers to be entered as is, otherwise add "-#{count + 1}"
  -- to the end of the ls_number
  IF _count > 0 THEN
    NEW.ls_number := NEW.ls_number || '-' || CAST(_count + 1 AS text);
  END IF;

  RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql;

INSERT INTO ls VALUES (NEXTVAL('ls_ls_id_seq'),7285,UPPER('20151012'));
--> Query returned successfully: one row affected, 391 ms execution time.

The count query on its own is very fast:

SELECT COUNT(ls_number)
FROM ls
WHERE ls_number LIKE CAST('20151012' || '%' AS text);
--> 19ms

For comparison, I tried a similar trigger that runs its count against a different table with the same number of rows and a similar query time.

SELECT COUNT(lsdetail_id)
FROM lsdetail
WHERE lsdetail_id > 2433308
--> 20ms

Running the same insert with the count pointed at that other table returns roughly 20 times faster.

INSERT INTO ls VALUES (NEXTVAL('ls_ls_id_seq'),7285,UPPER('20151012'));
 --> Query returned successfully: one row affected, 20 ms execution time.

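The ls table has about 2.5 million rows.

I have tried several different approaches, and the problem seems to appear only when the trigger selects from the same table I am inserting into.

I would like to know why this happens, but I am also open to better ways of creating the "sub-lot" numbers.

Thanks!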

Answer 1

Found the answer here: http://www.postgresql.org/message-id/27705.1150381444@sss.pgh.pa.us

Re: How to analyze function performance

"Mindaugas" writes:

Is it possible to somehow analyze function performance? E.g. we are using function cleanup() which takes obviously too much time to execute but I have problems trying to figure what is slowing things down.

When I explain analyze function lines step by step it show quite acceptable performance.

--

Are you sure you are "explain analyze"ing the same queries the function is really doing? You have to account for the fact that what plpgsql is issuing is parameterized queries, and sometimes that limits the planner's ability to pick a good plan. For instance, if you have

declare x int;
begin
    ...
    for r in select * from foo where key = x loop ...

then what is really getting planned and executed is "select * from foo where key = $1" --- every plpgsql variable gets replaced by a parameter symbol "$n". You can model this for EXPLAIN purposes with a prepared statement:

prepare p1(int) as select * from foo where key = $1;
explain analyze execute p1(42);

If you find out that a particular query really sucks when parameterized, you can work around this by using EXECUTE to force the query to be planned afresh on each use with literal constants instead of parameters:

Then I looked into this: http://www.postgresql.org/docs/9.1/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-EXECUTING-DYN

39.5.4. Executing Dynamic Commands

Oftentimes you will want to generate dynamic commands inside your PL/pgSQL functions, that is, commands that will involve different tables or different data types each time they are executed. PL/pgSQL's normal attempts to cache plans for commands (as discussed in Section 39.10.2) will not work in such scenarios. To handle this sort of problem, the EXECUTE statement is provided:

EXECUTE 'SELECT count(*) FROM mytable WHERE inserted_by = $1 AND inserted <= $2'
INTO c
USING checked_user, checked_date;

--
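The prepared-statement technique from the quoted mailing-list reply can be applied to the count query from the question. The following is only a sketch: it assumes the ls table from the question is on the search path, and ls_count_by_prefix is a hypothetical statement name. Comparing the two plans should show whether the parameterized form loses the ability to treat the LIKE pattern as a known prefix, which would be consistent with the slow trigger.

```sql
-- Literal pattern: the planner can see the prefix '20151012'.
EXPLAIN ANALYZE
SELECT COUNT(ls_number)
FROM ls
WHERE ls_number LIKE CAST('20151012' || '%' AS text);

-- Parameterized form, closer to what the trigger's static SELECT issues.
-- ls_count_by_prefix is a hypothetical name chosen for this sketch.
PREPARE ls_count_by_prefix(text) AS
  SELECT COUNT(ls_number)
  FROM ls
  WHERE ls_number LIKE CAST($1 || '%' AS text);

EXPLAIN ANALYZE EXECUTE ls_count_by_prefix('20151012');

-- Clean up the session-level prepared statement.
DEALLOCATE ls_count_by_prefix;
```

On PostgreSQL 9.2 and later, a prepared statement may be planned with the actual parameter value for its first few executions, so the second plan might not differ there. On version 12 and later, running SET plan_cache_mode = force_generic_plan; beforehand should force the generic plan for the comparison.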

So in the end, the count select was changed to:

EXECUTE 'SELECT COALESCE(COUNT(ls_number), 0) FROM ls WHERE ls_number LIKE $1 || ''%'';'
 INTO _count
 USING NEW.ls_number;
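For context, here is how that change might look inside the trimmed-down trigger function from the question. This is a sketch, not the full production function: it assumes the stone schema and the ls table already exist, it keeps the known flaw of skipping -1, and it drops the COALESCE because COUNT already returns 0 when no rows match.

```sql
CREATE OR REPLACE FUNCTION stone._lsuniqueid()
  RETURNS trigger AS
$BODY$
DECLARE
  _count INTEGER;
BEGIN
  -- Dynamic SQL is planned each time EXECUTE runs, so no cached
  -- parameterized plan is reused for this statement.
  EXECUTE 'SELECT COUNT(ls_number) FROM ls WHERE ls_number LIKE $1 || ''%'''
  INTO _count
  USING NEW.ls_number;

  -- Allow new ls_numbers to be entered as is, otherwise append
  -- "-#{count + 1}" to the end of the ls_number.
  IF _count > 0 THEN
    NEW.ls_number := NEW.ls_number || '-' || CAST(_count + 1 AS text);
  END IF;

  RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql;
```

The trade-off is the one noted in the quoted reply's advice about EXECUTE: the query is planned afresh on each use, which costs some planning time on every insert.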
