Expanding Oracle rows with comma-delimited values into multiple rows

I have a table in Oracle that looks like this:

KEY,VALS
k1,"a,b"

I need it to look like this:

KEY,VAL
k1,a
k1,b

I accomplished this with CONNECT BY LEVEL, following this example:

with t as (
    select 'k1' as key, 'a,b' as vals
    from dual
)
select key, regexp_substr(vals, '[^,]+', 1, level) as val
from t
connect by LEVEL <= length(vals) - length(replace(vals, ',')) + 1

But the table can contain multiple rows, and vals can hold comma-delimited lists of different lengths, for example:

KEY,VALS
k1,"a,b"
k2,"c,d,e"

I'm looking for a result like this:

KEY,VAL
k1,a
k1,b
k2,c
k2,d
k2,e

But the naive approach above doesn't work, because every row at each level is connected to every row at the level above it, resulting in:

with t as (
    select 'k1' as key, 'a,b' as vals
    from dual
    union
    select 'k2' as key, 'c,d,e' as vals
    from dual
)
select key, regexp_substr(vals, '[^,]+', 1, level) as val
from t
connect by LEVEL <= length(vals) - length(replace(vals, ',')) + 1

KEY,VAL
k1,a
k1,b
k2,e
k2,d
k2,e
k2,c
k1,b
k2,e
k2,d
k2,e
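
To see why the rows multiply, it can help to print the path each result row took through the hierarchy. The following is only a diagnostic sketch; the lvl and path column aliases are illustrative and not part of the original query. Without a START WITH or PRIOR condition, every row of t is a root and is also a candidate child of every row at the level above.

```sql
-- Diagnostic sketch: same data and CONNECT BY condition as above,
-- plus LEVEL and SYS_CONNECT_BY_PATH to expose each row's ancestry.
with t as (
    select 'k1' as key, 'a,b' as vals from dual
    union
    select 'k2' as key, 'c,d,e' as vals from dual
)
select key,
       level as lvl,
       sys_connect_by_path(key, '/') as path,  -- e.g. /k1/k2 is a k2 row reached through a k1 parent
       regexp_substr(vals, '[^,]+', 1, level) as val
from t
connect by level <= length(vals) - length(replace(vals, ',')) + 1
order by lvl, path;
```

Paths that mix keys, such as /k1/k2, are the cross-connections responsible for the duplicated values.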

I suspect I need some sort of CONNECT BY PRIOR condition, but I'm not sure what it should be. When I try matching on the key:

connect by prior key = key
       and LEVEL <= length(vals) - length(replace(vals, ',')) + 1

I get an "ORA-01436: CONNECT BY loop in user data" error.

What's the right approach here?

Option 1: Simple, fast string functions in a recursive query:

with t (key, vals) as (
    SELECT 'k1', 'a,b'   FROM DUAL UNION ALL
    SELECT 'k2', 'c,d,e' FROM DUAL
),
bounds (key, vals, spos, epos) AS (
  SELECT key, vals, 1, INSTR(vals, ',', 1)
  FROM t
UNION ALL
  SELECT key, vals, epos + 1, INSTR(vals, ',', epos + 1)
  FROM bounds
  WHERE  epos > 0
)
SEARCH DEPTH FIRST BY key SET key_order
SELECT key,
       CASE epos
       WHEN 0
       THEN SUBSTR(vals, spos)
       ELSE SUBSTR(vals, spos, epos - spos)
       END AS val
FROM   bounds;
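
The bounds subquery can be easier to follow by looking at the positions it generates before any substrings are cut. This is a sketch over the same sample data; it omits the SEARCH clause and sorts explicitly instead. Here spos is the start of each value and epos is the position of the next comma, or 0 when none remains.

```sql
WITH t (key, vals) AS (
    SELECT 'k1', 'a,b'   FROM DUAL UNION ALL
    SELECT 'k2', 'c,d,e' FROM DUAL
),
bounds (key, vals, spos, epos) AS (
  -- Anchor member: the first value starts at position 1
  SELECT key, vals, 1, INSTR(vals, ',', 1)
  FROM t
UNION ALL
  -- Recursive member: the next value starts just after the previous comma
  SELECT key, vals, epos + 1, INSTR(vals, ',', epos + 1)
  FROM bounds
  WHERE epos > 0
)
SELECT key, spos, epos
FROM   bounds
ORDER BY key, spos;
-- For 'a,b'   the (spos, epos) pairs are (1,2), (3,0)
-- For 'c,d,e' the (spos, epos) pairs are (1,2), (3,4), (5,0)
```

The recursion stops for a row once epos is 0, which is why the final SELECT in Option 1 takes the rest of the string in that case.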

Option 2: Slower regular expressions in a LATERAL-joined hierarchical query:

This option requires Oracle 12 or later.

with t (key, vals) as (
    SELECT 'k1', 'a,b'   FROM DUAL UNION ALL
    SELECT 'k2', 'c,d,e' FROM DUAL
)
SELECT key, val
FROM   t
       LEFT OUTER JOIN LATERAL (
         SELECT regexp_substr(vals, '[^,]+', 1, level) AS val
         FROM   DUAL
         CONNECT BY LEVEL <= REGEXP_COUNT(vals, '[^,]+')
       )
       ON (1 = 1);

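Option 3: Hierarchical query correlated to parent rows.

This is the slowest of the three options because it has to correlate between levels of the hierarchy and generate a GUID at each step. The PRIOR SYS_GUID() IS NOT NULL condition looks useless, but because SYS_GUID() returns a different value on every call, Oracle no longer treats parent and child rows as identical, which avoids the ORA-01436 loop error that PRIOR key = key raises on its own.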

with t (key, vals) as (
    SELECT 'k1', 'a,b'   FROM DUAL UNION ALL
    SELECT 'k2', 'c,d,e' FROM DUAL
)
SELECT key,
       regexp_substr(vals, '[^,]+', 1, level) AS val
FROM   t
CONNECT BY LEVEL <= REGEXP_COUNT(vals, '[^,]+')
AND PRIOR key = key
AND PRIOR SYS_GUID() IS NOT NULL;

All of which output:

KEY VAL
k1 a
k1 b
k2 c
k2 d
k2 e
