SQL: Update with all possible combinations

I have the relation

+-----+----+
| seq | id |
+-----+----+
|   1 | A1 |
|   2 | B1 |
|   3 | C1 |
|   4 | D1 |
+-----+----+

and want to join it in PostgreSQL with

+----+-------+
| id | alter |
+----+-------+
| B1 | B2    |
| D1 | D2    |
+----+-------+

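so that I get all possible combinations of replacements (i.e. more or less the Cartesian product of the replacements). So group 1 has no update, group 2 only B2, group 3 only D2 and group 4 both B2 and D2.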

The end result should look like this, but it should be open to more replacements (e.g. an additional D3 for D1):

+-------+-----+----+
| group | seq | id |
+-------+-----+----+
|     1 |   1 | A1 |
|     1 |   2 | B1 |
|     1 |   3 | C1 |
|     1 |   4 | D1 |
|     2 |   1 | A1 |
|     2 |   2 | B2 |
|     2 |   3 | C1 |
|     2 |   4 | D1 |
|     3 |   1 | A1 |
|     3 |   2 | B1 |
|     3 |   3 | C1 |
|     3 |   4 | D2 |
|     4 |   1 | A1 |
|     4 |   2 | B2 |
|     4 |   3 | C1 |
|     4 |   4 | D2 |
+-------+-----+----+

Edit:

Another possible replacement table could be

+----+-------+
| id | alter |
+----+-------+
| B1 | B2    |
| D1 | D2    |
| D1 | D3    |
+----+-------+

which could result in 6 groups (I hope I have not forgotten a case):

+-------+-----+----+
| group | seq | id |
+-------+-----+----+
|     1 |   1 | A1 |
|     1 |   2 | B1 |
|     1 |   3 | C1 |
|     1 |   4 | D1 |
|     2 |   1 | A1 |
|     2 |   2 | B2 |
|     2 |   3 | C1 |
|     2 |   4 | D1 |
|     3 |   1 | A1 |
|     3 |   2 | B2 |
|     3 |   3 | C1 |
|     3 |   4 | D2 |
|     4 |   1 | A1 |
|     4 |   2 | B2 |
|     4 |   3 | C1 |
|     4 |   4 | D3 |
|     5 |   1 | A1 |
|     5 |   2 | B1 |
|     5 |   3 | C1 |
|     5 |   4 | D2 |
|     6 |   1 | A1 |
|     6 |   2 | B1 |
|     6 |   3 | C1 |
|     6 |   4 | D3 |
+-------+-----+----+

If instead you have three replacements, such as

+----+-------+
| id | alter |
+----+-------+
| B1 | B2    |
| C1 | C2    |
| D1 | D3    |
+----+-------+

it would result in 8 groups. What I have tried so far has not really helped:


WITH a as (SELECT * FROM (values (1,'A1'),(2,'B1'), (3,'C1'), (4,'D1')   ) as a1(seq, id) )
, b as (SELECT * FROM (values ('B1','B2'), ('D1','D2')) as b1(id,alter) )
---------
SELECT row_number() OVER (PARTITION BY a.id) as g, * FROM 
a
CROSS JOIN  b as b1
CROSS JOIN  b as b2
LEFT JOIN b as b3 ON a.id=b3.id
ORDER by g,seq;

I would be happy about suggestions for a better title.

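I can only think of a brute-force approach: enumerate the groups and multiply them by the second table, so that each group has its own set of replacement rows.

The query below then uses bit manipulation to choose which values to apply in each group: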

WITH a as (
      SELECT * FROM (values (1,'A1'),(2,'B1'), (3,'C1'), (4,'D1')   ) as a1(seq, id)
      ),
     b as (
      SELECT * FROM (values ('B1','B2'), ('D1','D2')) as b1(id,alter)
     ),
     bgroups as (
      SELECT b.*, grp - 1 as grp, ROW_NUMBER() OVER (PARTITION BY grp ORDER BY id) - 1 as seqnum
      FROM b CROSS JOIN
           GENERATE_SERIES(1, (SELECT POWER(2, COUNT(*))::int FROM b)) gs(grp)
     )
SELECT bg.grp, a.seq, 
       COALESCE(MAX(CASE WHEN a.id = bg.id AND (POWER(2, bg.seqnum)::int & bg.grp) > 0 THEN bg.alter END),
                MAX(a.id)
               ) as id
FROM a CROSS JOIN
     bgroups bg
GROUP BY bg.grp, a.seq
ORDER BY bg.grp, a.seq;

Here is a db<>fiddle.
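To make the bit trick easier to follow outside the database, here is a minimal Python sketch of the same idea. It is not part of the original answer: the function name `bitmask_groups` and the in-memory lists are assumptions for illustration. Bit i of the group number decides whether the i-th replacement (ordered by id, like the ROW_NUMBER() in the query) is applied.

```python
# Hypothetical in-memory stand-ins for the CTEs a and b.
source = [(1, "A1"), (2, "B1"), (3, "C1"), (4, "D1")]
replacements = [("B1", "B2"), ("D1", "D2")]


def bitmask_groups(source, replacements):
    """Return (grp, seq, id) rows; bit i of grp switches replacement i on."""
    ordered = sorted(replacements)  # plays the role of seqnum (0-based)
    rows = []
    for grp in range(2 ** len(ordered)):  # groups 0 .. 2^n - 1
        # Keep only the replacements whose bit is set in this group number.
        chosen = {old: new for i, (old, new) in enumerate(ordered) if grp & (1 << i)}
        for seq, id_ in source:
            rows.append((grp, seq, chosen.get(id_, id_)))
    return rows


for row in bitmask_groups(source, replacements):
    print(row)
```

One thing the sketch makes visible: with two replacements for the same id (D1 to D2 and D1 to D3) this scheme enumerates 2^3 = 8 bit patterns, so some groups come out identical rather than the 6 distinct groups asked for in the edit.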

So group 1 has no update, group 2 only B2, group 3 only D2 and group 4 both B2 and D2.

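Since the logic of that statement is not stored in any table, I decided to put it into a table c: the existing table a gets 3 new columns, one for each selection of fields that has to be considered.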

WITH a as (SELECT * FROM (values (1,'A1'),(2,'B1'), (3,'C1'), (4,'D1')   ) as a1(seq, id) )
, b as (SELECT * FROM (values ('B1','B2'), ('D1','D2')) as b1(id,alter) )
, c as (
SELECT a.seq, a.id,
COALESCE(b1.alter,a.id) as id2,
COALESCE(b2.alter,a.id) as id3,
COALESCE(b3.alter,a.id) as id4
FROM a
LEFT JOIN (SELECT * FROM b WHERE b.alter='B2') b1 ON a.id = b1.id
LEFT JOIN (SELECT * FROM b WHERE b.alter='D2') b2 ON a.id = b2.id
LEFT JOIN (SELECT * FROM b WHERE b.alter IN ('B2','D2')) b3 ON a.id = b3.id)
, d as (SELECT * FROM (values (1),(2), (3), (4)   ) as d1(gr) )



SELECT d.gr,
CASE d.gr
   WHEN 1 THEN c.id
   WHEN 2 THEN c.id2
   WHEN 3 THEN c.id3
   WHEN 4 THEN c.id4 END as id

FROM d
CROSS JOIN  c
ORDER by d.gr, c.seq

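What you need

With the additional information from your comments, your case seems to be this:

You have toll stations with a given number of booths: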

CREATE TABLE station (
  station text PRIMARY KEY
, booths  int NOT NULL  -- number of cashiers in station
);
INSERT INTO station VALUES 
  ('A', 1)
, ('B', 2)
, ('C', 1)
, ('D', 3);

For a given route, like A --> B --> C --> D, you want to generate all possible paths, taking the booth numbers into account. I suggest an SQL function with a recursive CTE like:

CREATE OR REPLACE FUNCTION f_pathfinder(_route text[])
  RETURNS TABLE (grp int, path text[]) LANGUAGE sql STABLE PARALLEL SAFE AS
$func$
WITH RECURSIVE rcte AS (
   SELECT cardinality(_route) AS hops, 1 AS hop, ARRAY[s.station || booth] AS path
   FROM   station s, generate_series(1, s.booths) booth
   WHERE  s.station = _route[1]

   UNION ALL
   SELECT r.hops, r.hop + 1, r.path || (s.station || booth)
   FROM   rcte  r
   JOIN   station s ON s.station = _route[r.hop + 1], generate_series(1, s.booths) booth
   WHERE  r.hop < r.hops
   )
SELECT row_number() OVER ()::int AS grp, path
FROM   rcte r
WHERE  r.hop = r.hops;
$func$;

Simple call:

SELECT * FROM f_pathfinder('{A,B,C,D}'::text[]);

Result:

 grp | path
---: | :--------
   1 | {A1,B1,C1,D1}
   2 | {A1,B1,C1,D2}
   3 | {A1,B1,C1,D3}
   4 | {A1,B2,C1,D1}
   5 | {A1,B2,C1,D2}
   6 | {A1,B2,C1,D3}

Or with unnested arrays (matching the result you displayed):

SELECT grp, seq, booth
FROM   f_pathfinder('{A,B,C,D}'::text[])
     , unnest(path) WITH ORDINALITY AS x(booth, seq);  -- ①

Result:

grp | seq | booth
--: | --: | :----
  1 |   1 | A1   
  1 |   2 | B1   
  1 |   3 | C1   
  1 |   4 | D1   
  2 |   1 | A1   
  2 |   2 | B1   
  2 |   3 | C1   
  2 |   4 | D2   
  3 |   1 | A1   
  3 |   2 | B1   
  3 |   3 | C1   
  3 |   4 | D3   
  4 |   1 | A1   
  4 |   2 | B2   
  4 |   3 | C1   
  4 |   4 | D1   
  5 |   1 | A1   
  5 |   2 | B2   
  5 |   3 | C1   
  5 |   4 | D2   
  6 |   1 | A1   
  6 |   2 | B2   
  6 |   3 | C1   
  6 |   4 | D3   

db<>fiddle here

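The number of variants grows quickly with the number of stops in your route: M1 * M2 * ... * Mn, where Mn is the number of booths at the n-th station.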
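As a quick cross-check of that count, here is a small Python sketch (not part of the SQL solution; the dictionary `booths` and the helper `paths` are illustrative names). It enumerates the paths with a plain Cartesian product and compares the total against the product of the booth counts.

```python
from itertools import product
from math import prod

# Same data as the station table: station -> number of booths.
booths = {"A": 1, "B": 2, "C": 1, "D": 3}


def paths(route, booths):
    """Return every path through the route, one booth per station."""
    options = [[f"{s}{b}" for b in range(1, booths[s] + 1)] for s in route]
    return [list(p) for p in product(*options)]


route = ["A", "B", "C", "D"]
all_paths = paths(route, booths)
expected = prod(booths[s] for s in route)  # M1 * M2 * ... * Mn

for grp, path in enumerate(all_paths, start=1):
    print(grp, path)
print(len(all_paths), expected)
```

For the sample route both numbers are 1 * 2 * 1 * 3 = 6, which matches the six groups shown in the result.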

① About ORDINALITY:

  • PostgreSQL unnest() with element number

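What you asked (originally)

It seems you want to apply all possible combinations from the set of changes listed in the replacement table rpl to the target table tbl.

With just two rows, forming the 4 (2^n) possible combinations is simple. For a general solution, I suggest a basic combinatorics function that generates all combinations. There are countless ways to write one. Here is a pure SQL function: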

CREATE OR REPLACE FUNCTION f_allcombos(_len int)
  RETURNS SETOF bool[] LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
$func$
WITH RECURSIVE
   tf(b) AS (VALUES (false), (true))

 , rcte AS (
   SELECT 1 AS lvl, ARRAY[b] AS arr
   FROM   tf

   UNION ALL
   SELECT r.lvl + 1, r.arr || tf.b
   FROM   rcte r, tf 
   WHERE  lvl < _len
   )
SELECT arr
FROM   rcte
WHERE  lvl = _len;
$func$;

Similar to what is discussed here:

Example for just 2 replacement rows:

SELECT * FROM f_allcombos(2);
{f,f}
{t,f}
{f,t}
{t,t}

Query

WITH effective_rpl AS (  -- ①
   SELECT *, count(alter) OVER (ORDER BY seq) AS idx  -- ②
   FROM   tbl LEFT JOIN rpl USING (id)
   )
SELECT c.grp, e.seq
     , CASE WHEN alter IS NOT NULL AND c.arr[e.idx] THEN e.alter  -- ③
            ELSE e.id END AS id
FROM   effective_rpl e
     , f_allcombos((SELECT count(alter)::int FROM effective_rpl))  -- ④
          WITH ORDINALITY AS c(arr, grp); -- ⑤

This produces exactly your desired result.

db<>fiddle here

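① Some replacements may have no match in the target table, so determine the effective replacements first.

② count() only counts non-null values, so the running count can serve as the index into the 1-based array returned from f_allcombos().

③ Only replace when a replacement is available and the boolean array holds true at the given index idx.

④ The CROSS JOIN multiplies the set of rows in the target table by the number of possible replacement combinations.

⑤ I use WITH ORDINALITY to generate "group numbers". See: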

  • PostgreSQL unnest() with element number

We could wire that into the function directly, but I would rather keep it generic.

Aside: "alter" is non-reserved in Postgres, but it is a reserved word in standard SQL.
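For readers who want to trace the logic of notes ① to ⑤ outside the database, here is a rough Python sketch. The names `apply_all_combos`, `tbl` and `rpl` as in-memory objects are assumptions for illustration, the extra pair `X1`/`X2` is invented to show a replacement without a match, and indexes are 0-based here where the SQL arrays are 1-based. Like the query, it assumes at most one replacement per id.

```python
from itertools import product

tbl = [(1, "A1"), (2, "B1"), (3, "C1"), (4, "D1")]
# Hypothetical extra pair X1 -> X2 has no match in tbl and must be ignored.
rpl = {"B1": "B2", "D1": "D2", "X1": "X2"}


def apply_all_combos(tbl, rpl):
    """Return (grp, seq, id) rows for every on/off combination of replacements."""
    # Notes 1 and 2: keep only effective replacements, numbered in seq order.
    idx = {}
    for seq, id_ in sorted(tbl):
        if id_ in rpl:
            idx[id_] = len(idx)
    rows = []
    # Notes 4 and 5: one group per boolean array, numbered from 1.
    for grp, arr in enumerate(product([False, True], repeat=len(idx)), start=1):
        for seq, id_ in sorted(tbl):
            # Note 3: replace only if a replacement exists and its flag is true.
            new_id = rpl[id_] if id_ in idx and arr[idx[id_]] else id_
            rows.append((grp, seq, new_id))
    return rows


for row in apply_all_combos(tbl, rpl):
    print(row)
```

The order of the groups depends on the order in which the boolean arrays are generated, so group numbers may differ from the SQL output even though the set of groups is the same.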

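Answer updated after the question was edited

The tricky part of this problem is generating the power set of the replacements. Fortunately, Postgres supports recursive queries, and power sets can be computed recursively. So we can build a general solution that works regardless of the size of the replacement set.

Let's call the first table source and the second table replacements. I will also avoid the awkward column name alter and use sub instead: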

CREATE TABLE source (seq, id) as (
  VALUES (1, 'A1'), (2, 'B1'), (3, 'C1'), (4, 'D1')
);
CREATE TABLE replacements (id, sub) as (
  VALUES ('B1', 'B2'), ('D1', 'D2')
);

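First, the power set of the ids to be replaced needs to be generated. The empty set can be omitted, since it would not work with the joins anyway; at the end, the source table can be unioned to the intermediate result to provide the same output.

In the recursive step, the JOIN condition rec.id > repl.id ensures that each id occurs only once in each generated subset.

In the final step:

  • the cross join fans out the source N times, where N is the number of non-empty combinations of replacements (with variations)

  • the group names are generated using a filtered running sum on seq

  • the subsets are unnested, and COALESCE swaps in the replacement value wherever the replacement id equals the source id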

WITH RECURSIVE rec AS (
  SELECT ARRAY[(id, sub)] subset, id FROM replacements
  UNION ALL
  SELECT subset || (repl.id, sub), repl.id 
  FROM replacements repl 
  JOIN rec ON rec.id > repl.id
)
SELECT NULL subset, 0 set_name, seq, id FROM source
UNION ALL
SELECT subset
, SUM(seq) FILTER (WHERE seq = 1) OVER (ORDER BY subset, seq) set_name 
, seq
, COALESCE(sub, source.id) id
FROM rec 
CROSS JOIN source
LEFT JOIN LATERAL (
  SELECT id, sub 
  FROM unnest(subset) x(id TEXT, sub TEXT)
  ) x ON source.id = x.id;
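To see why the condition rec.id > repl.id yields each valid subset exactly once, here is a hedged Python sketch of the recursion. The function names `subsets` and `apply_subset` are made up for this illustration, and the sketch mirrors only the logic, not the SQL types.

```python
source = [(1, "A1"), (2, "B1"), (3, "C1"), (4, "D1")]
replacements = [("B1", "B2"), ("D1", "D2"), ("D1", "D3")]


def subsets(replacements):
    """Return all non-empty subsets in which every id appears at most once."""
    result = [[r] for r in replacements]  # non-recursive term: single pairs
    frontier = list(result)
    while frontier:
        # Recursive term: extend a subset only with a strictly smaller id,
        # which is what rec.id > repl.id does in the query.
        frontier = [s + [r] for s in frontier for r in replacements if s[-1][0] > r[0]]
        result.extend(frontier)
    return result


def apply_subset(source, subset):
    """Replace ids that occur in the subset, keep all others (the COALESCE)."""
    mapping = dict(subset)
    return [(seq, mapping.get(id_, id_)) for seq, id_ in source]


for subset in subsets(replacements):
    print(subset, apply_subset(source, subset))
```

Because ids must strictly decrease along a subset, (D1,D2) and (D1,D3) can never end up together, which gives 5 non-empty subsets here, or 6 groups once the untouched source is added.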

Test

With the replacement values ('B1', 'B2'), ('D1', 'D2'), the query returns 4 groups:

        subset         | set_name | seq | id 
-----------------------+----------+-----+----
                       |        0 |   1 | A1
                       |        0 |   2 | B1
                       |        0 |   3 | C1
                       |        0 |   4 | D1
 {"(B1,B2)"}           |        1 |   1 | A1
 {"(B1,B2)"}           |        1 |   2 | B2
 {"(B1,B2)"}           |        1 |   3 | C1
 {"(B1,B2)"}           |        1 |   4 | D1
 {"(D1,D2)"}           |        2 |   1 | A1
 {"(D1,D2)"}           |        2 |   2 | B1
 {"(D1,D2)"}           |        2 |   3 | C1
 {"(D1,D2)"}           |        2 |   4 | D2
 {"(D1,D2)","(B1,B2)"} |        3 |   1 | A1
 {"(D1,D2)","(B1,B2)"} |        3 |   2 | B2
 {"(D1,D2)","(B1,B2)"} |        3 |   3 | C1
 {"(D1,D2)","(B1,B2)"} |        3 |   4 | D2
(16 rows)

With the replacement values ('B1', 'B2'), ('D1', 'D2'), ('D1', 'D3'), the query returns 6 groups:

        subset         | set_name | seq | id 
-----------------------+----------+-----+----
                       |        0 |   1 | A1
                       |        0 |   2 | B1
                       |        0 |   3 | C1
                       |        0 |   4 | D1
 {"(B1,B2)"}           |        1 |   1 | A1
 {"(B1,B2)"}           |        1 |   2 | B2
 {"(B1,B2)"}           |        1 |   3 | C1
 {"(B1,B2)"}           |        1 |   4 | D1
 {"(D1,D2)"}           |        2 |   1 | A1
 {"(D1,D2)"}           |        2 |   2 | B1
 {"(D1,D2)"}           |        2 |   3 | C1
 {"(D1,D2)"}           |        2 |   4 | D2
 {"(D1,D2)","(B1,B2)"} |        3 |   1 | A1
 {"(D1,D2)","(B1,B2)"} |        3 |   2 | B2
 {"(D1,D2)","(B1,B2)"} |        3 |   3 | C1
 {"(D1,D2)","(B1,B2)"} |        3 |   4 | D2
 {"(D1,D3)"}           |        4 |   1 | A1
 {"(D1,D3)"}           |        4 |   2 | B1
 {"(D1,D3)"}           |        4 |   3 | C1
 {"(D1,D3)"}           |        4 |   4 | D3
 {"(D1,D3)","(B1,B2)"} |        5 |   1 | A1
 {"(D1,D3)","(B1,B2)"} |        5 |   2 | B2
 {"(D1,D3)","(B1,B2)"} |        5 |   3 | C1
 {"(D1,D3)","(B1,B2)"} |        5 |   4 | D3
(24 rows)

With the replacement values ('B1', 'B2'), ('C1', 'C2'), ('D1', 'D2'), the query returns 8 groups:

             subset              | set_name | seq | id 
---------------------------------+----------+-----+----
                                 |        0 |   1 | A1
                                 |        0 |   2 | B1
                                 |        0 |   3 | C1
                                 |        0 |   4 | D1
 {"(B1,B2)"}                     |        1 |   1 | A1
 {"(B1,B2)"}                     |        1 |   2 | B2
 {"(B1,B2)"}                     |        1 |   3 | C1
 {"(B1,B2)"}                     |        1 |   4 | D1
 {"(C1,C2)"}                     |        2 |   1 | A1
 {"(C1,C2)"}                     |        2 |   2 | B1
 {"(C1,C2)"}                     |        2 |   3 | C2
 {"(C1,C2)"}                     |        2 |   4 | D1
 {"(C1,C2)","(B1,B2)"}           |        3 |   1 | A1
 {"(C1,C2)","(B1,B2)"}           |        3 |   2 | B2
 {"(C1,C2)","(B1,B2)"}           |        3 |   3 | C2
 {"(C1,C2)","(B1,B2)"}           |        3 |   4 | D1
 {"(D1,D2)"}                     |        4 |   1 | A1
 {"(D1,D2)"}                     |        4 |   2 | B1
 {"(D1,D2)"}                     |        4 |   3 | C1
 {"(D1,D2)"}                     |        4 |   4 | D2
 {"(D1,D2)","(B1,B2)"}           |        5 |   1 | A1
 {"(D1,D2)","(B1,B2)"}           |        5 |   2 | B2
 {"(D1,D2)","(B1,B2)"}           |        5 |   3 | C1
 {"(D1,D2)","(B1,B2)"}           |        5 |   4 | D2
 {"(D1,D2)","(C1,C2)"}           |        6 |   1 | A1
 {"(D1,D2)","(C1,C2)"}           |        6 |   2 | B1
 {"(D1,D2)","(C1,C2)"}           |        6 |   3 | C2
 {"(D1,D2)","(C1,C2)"}           |        6 |   4 | D2
 {"(D1,D2)","(C1,C2)","(B1,B2)"} |        7 |   1 | A1
 {"(D1,D2)","(C1,C2)","(B1,B2)"} |        7 |   2 | B2
 {"(D1,D2)","(C1,C2)","(B1,B2)"} |        7 |   3 | C2
 {"(D1,D2)","(C1,C2)","(B1,B2)"} |        7 |   4 | D2
(32 rows)