How to write an efficient UPDATE-SELECT SQL statement

I wrote an SQL statement for a table with about 50,000,000 users. The query took much longer than I expected: after about 23 hours it still had not finished.

UPDATE users
    SET building_id = B.id
    FROM (
      SELECT *
      FROM buildings B
    ) AS B
    WHERE B.city          = address_city
      AND B.town          = address_town
      AND B.neighbourhood = address_neighbourhood
      AND B.street        = address_street
      AND B.no            = address_building_no

The idea behind this SQL is to remove the building/address information from users and have users reference the buildings table instead.
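To make the goal concrete, the end state might look like the sketch below. It assumes that buildings.id is the primary key of buildings (or at least has a unique constraint) and that building_id already exists on users; the constraint name is made up for illustration.

```sql
-- Run only after building_id has been filled in.
-- Assumption: buildings.id is a primary key or has a unique constraint.
ALTER TABLE users
    ADD CONSTRAINT users_building_id_fkey   -- hypothetical constraint name
    FOREIGN KEY (building_id) REFERENCES buildings (id);

-- The address columns are then redundant and can be dropped.
ALTER TABLE users
    DROP COLUMN address_city,
    DROP COLUMN address_town,
    DROP COLUMN address_neighbourhood,
    DROP COLUMN address_street,
    DROP COLUMN address_building_no;
```

Users whose address matches no row in buildings would lose their address once the columns are dropped, so it is probably worth checking for rows where building_id is still NULL first.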

EXPLAIN

Update on users  (cost=22226900.43..22548054.14 rows=15212 width=166) 
->  Merge Join  (cost=22226900.43..22548054.14 rows=15212 width=166)
         Merge Cond: (((users.address_city)::text = (b.city)::text) AND ((users.address_town)::text = (b.town)::text) AND ((users.address_neighbourhood)::text = (b.neighbourhood)::text) AND ((users.address_street)::text = (b.street)::text) AND ((users.address_building_no)::text = (b.no)::text))
         ->  Sort  (cost=21352886.76..21401078.96 rows=96384398 width=156)
               Sort Key: users.address_city, users.address_town, users.address_neighbourhood, users.address_street, users.address_building_no
               ->  Seq Scan on users  (cost=0.00..2559921.19 rows=96384398 width=156)
         ->  Materialize  (cost=874013.68..883606.86 rows=9593179 width=63)
               ->  Sort  (cost=874013.68..878810.27 rows=9593179 width=63)
                     Sort Key: b.city, b.town, b.neighbourhood, b.street, b.no
                     ->  Seq Scan on buildings b  (cost=0.00..136253.54 rows=9593179 width=63)
(10 rows)

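I don't know whether this SQL runs the inner SELECT once per user or caches its result. Also, if it caches the result, does it use the table's indexes on the cached temporary data?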

I could not write the SQL like this:

FROM (
  SELECT * 
  FROM buildings B
  WHERE B.city          = users.address_city
    AND B.town          = users.address_town
    AND B.neighbourhood = users.address_neighbourhood
    AND B.street        = users.address_street
    AND B.no            = users.address_building_no
  )

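It says that users cannot be accessed from the inner SELECT. Do you have any suggestions on how to filter buildings by the users columns inside the inner SQL statement?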
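In PostgreSQL, a subquery in the FROM clause of an UPDATE cannot refer to the table being updated, which is what the error is about. One form that does allow the reference is a correlated subquery in SET. A minimal sketch, assuming each address matches at most one row in buildings:

```sql
-- Correlated subquery: the inner SELECT may reference the users row being updated.
-- Assumption: at most one buildings row matches each address; otherwise
-- PostgreSQL raises an error because the subquery returns more than one row.
UPDATE users
SET building_id = (
    SELECT b.id
    FROM buildings b
    WHERE b.city          = users.address_city
      AND b.town          = users.address_town
      AND b.neighbourhood = users.address_neighbourhood
      AND b.street        = users.address_street
      AND b.no            = users.address_building_no
);
```

This form differs from the join version in two ways: users with no matching building get building_id set to NULL, and the subquery is typically evaluated once per users row. Without a suitable index on the five buildings columns it is unlikely to be faster than the join.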

I'm not sure, but wouldn't this be faster (at least slightly, if not considerably)?

UPDATE users
SET building_id = B.id
FROM buildings B
WHERE B.city          = address_city
  AND B.town          = address_town
  AND B.neighbourhood = address_neighbourhood
  AND B.street        = address_street
  AND B.no            = address_building_no

If nothing else, it should not need the Materialize stage shown in the EXPLAIN output above.
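One way to check this is to run EXPLAIN on the rewritten statement and compare the result with the plan in the question. Plain EXPLAIN only plans the statement and does not execute the update. Whether the Materialize node actually disappears depends on the planner, so treat this as a check rather than a guarantee.

```sql
-- Shows the plan without running the update (no ANALYZE keyword).
EXPLAIN
UPDATE users
SET building_id = B.id
FROM buildings B
WHERE B.city          = address_city
  AND B.town          = address_town
  AND B.neighbourhood = address_neighbourhood
  AND B.street        = address_street
  AND B.no            = address_building_no;
```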

I think

create table t as select column_list from a join b on column=column;
alter table users rename to users_old;
alter table t rename to users;

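would be faster and would only hold a lock for microseconds... provided, of course, that the table is not being edited at the moment and there is enough space in temp_tablespace.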
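Spelled out for the tables in the question, the approach might look like the sketch below. The columns id and name are hypothetical stand-ins for whatever users actually contains, and users_new and users_old are made-up table names.

```sql
-- Build the new table in one pass instead of updating every row in place.
-- LEFT JOIN keeps users whose address has no match (building_id is NULL).
-- Assumption: each address matches at most one buildings row; duplicates
-- in buildings would duplicate users rows here.
CREATE TABLE users_new AS
SELECT u.id,                 -- hypothetical column
       u.name,               -- hypothetical column
       b.id AS building_id
FROM users u
LEFT JOIN buildings b
       ON b.city          = u.address_city
      AND b.town          = u.address_town
      AND b.neighbourhood = u.address_neighbourhood
      AND b.street        = u.address_street
      AND b.no            = u.address_building_no;

-- Swap the tables in a single transaction so other sessions never see
-- a moment without a users table.
BEGIN;
ALTER TABLE users     RENAME TO users_old;
ALTER TABLE users_new RENAME TO users;
COMMIT;
```

CREATE TABLE AS copies only the data, so indexes, constraints and defaults have to be recreated on the new table. Any rows written to users while the copy is running would be missing from it, which is why the table should not be edited during the operation.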