How to split a CLOB object using , and : delimiters in Oracle into multiple records

I have a sample CLOB object, shown below. I want to first split it on the "," delimiter and save the result in a temporary table for later use.

ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0

I want to save the result in the format below, one value per row.

Column_Name
__________________________
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0

I tried using the REGEXP_SUBSTR function:

select 
    regexp_substr('ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0', '[^,]+', 1, 1) Column_Name 
from dual;

The query above gives me only a single record, as follows:

Column_Name
__________________________
ABCDEF:PmId12345RmLn1VlId0

Can anyone help me solve this?

Here is a solution using a recursive factored subquery (Oracle 11.2 and later):

with inputs ( str ) as (
       select to_clob('ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0')
       from dual
     ),
     prep ( s, n, token, st_pos, end_pos ) as (
       select ',' || str || ',', -1, null, null, 1
         from inputs
       union all
       select s, n+1, substr(s, st_pos, end_pos - st_pos),
              end_pos + 1, instr(s, ',', 1, n+3)
         from prep
         where end_pos != 0
     )
select n as idx, token as column_name
from   prep
where  n > 0;



   IDX COLUMN_NAME
------ ----------------------------
     1 ABCDEF:PmId12345RmLn1VlId0
     2 ABCDEF:PmId12345RmLn1VlId0
     3 ABCDEF:PmId12345RmLn1VlId0
     4 ABCDEF:PmId12345RmLn1VlId0
     5 ABCDEF:PmId12345RmLn1VlId0
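
Since the question asks for the result to be kept in a temporary table, one way to do that is to feed the same query into an INSERT. This is only a sketch: the table name split_tokens, its column sizes, and the ON COMMIT clause are assumptions made for illustration, and to_char() assumes every token fits in 4000 characters.

```sql
-- Hypothetical table; name, sizes and ON COMMIT behaviour are assumptions.
create global temporary table split_tokens (
  idx         number,
  column_name varchar2(4000)
) on commit preserve rows;

-- Same recursive query as above, used as the source of the insert.
insert into split_tokens ( idx, column_name )
with inputs ( str ) as (
       select to_clob('ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0')
       from dual
     ),
     prep ( s, n, token, st_pos, end_pos ) as (
       select ',' || str || ',', -1, null, null, 1
         from inputs
       union all
       select s, n+1, substr(s, st_pos, end_pos - st_pos),
              end_pos + 1, instr(s, ',', 1, n+3)
         from prep
         where end_pos != 0
     )
select n, to_char(token)   -- to_char() fails if a token exceeds 4000 characters
from   prep
where  n > 0;
```

With ON COMMIT PRESERVE ROWS the rows stay visible to the current session until it ends; ON COMMIT DELETE ROWS would clear them at the end of each transaction instead.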

Notes

You said CLOB, but in your example you are extracting from a varchar2 string. I added to_clob() to see if/how this works on a CLOB.

I used instr and substr because they often (usually?) perform better, sometimes much better, than their regexp equivalents.
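
To make the comparison concrete, here is a small sketch that pulls the second token out of a short string both ways. The sample string 'A:1,B:2,C:3' and the column aliases are made up for illustration; no timing claims are implied by it.

```sql
-- Extract the 2nd comma-separated token two ways (illustrative input).
select substr(s,
              instr(s, ',', 1, 1) + 1,                          -- start: just after the 1st comma
              instr(s, ',', 1, 2) - instr(s, ',', 1, 1) - 1     -- length: gap between 1st and 2nd comma
             )                                as via_substr,
       regexp_substr(s, '[^,]+', 1, 2)        as via_regexp     -- 2nd run of non-comma characters
from   ( select 'A:1,B:2,C:3' as s from dual );
```

Both columns return B:2. The instr/substr form is wordier, which is the price paid for avoiding the regular expression engine.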

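I saved the "index" of each substring within the input string; in some cases the order of the tokens in the input string matters. (Not in your example, though, where you just have the same token repeated five times.)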

If you need better performance, especially if your CLOBs are very large, you may be better off using dbms_lob.substr and dbms_lob.instr - see Performance of SUBSTR on CLOB, especially Alex Poole's answer, and the documentation at http://docs.oracle.com/cd/B28359_01/appdev.111/b28419/d_lob.htm#BABEAJAD (note the syntax differences from regular substr / instr).
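
As a rough sketch of the dbms_lob variant, the anonymous block below walks a CLOB with dbms_lob.instr and dbms_lob.substr and prints each token. The variable names and the short sample value are assumptions for illustration, and the loop assumes the input has no trailing comma and that each token fits in a varchar2. Note the argument order: dbms_lob.substr takes (lob, amount, offset), whereas regular substr takes (string, position, length).

```sql
set serveroutput on

declare
  l_clob  clob        := to_clob('A:1,B:2,C:3');         -- illustrative input
  l_len   pls_integer := dbms_lob.getlength(l_clob);
  l_start pls_integer := 1;                               -- first character of the current token
  l_end   pls_integer;                                    -- position of the next comma
begin
  loop
    -- dbms_lob.instr(lob, pattern, offset, nth): next comma at or after l_start
    l_end := dbms_lob.instr(l_clob, ',', l_start, 1);
    if l_end = 0 then
      l_end := l_len + 1;                                 -- no more commas: token runs to the end
    end if;

    -- dbms_lob.substr(lob, amount, offset): amount comes BEFORE offset
    dbms_output.put_line(dbms_lob.substr(l_clob, l_end - l_start, l_start));

    exit when l_end > l_len;
    l_start := l_end + 1;
  end loop;
end;
/
```

For the sample value this prints A:1, B:2 and C:3 on separate lines. In practice the put_line call would be replaced by an insert into whatever table should receive the tokens.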

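Just in case you really only want to parse a long string, as in your example. If you need to see the index of each value in the list, include "level" in the select: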

SQL> with tbl(str) as (
     select 'ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0'
       from dual
   )
   select regexp_substr(str, '(.*?)(,|$)', 1, level, NULL, 1) column_name
   from tbl
   connect by regexp_substr(str, '(.*?)(,|$)', 1, level) is not null;

COLUMN_NAME
--------------------------------------------------------------------------------
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0
ABCDEF:PmId12345RmLn1VlId0

16 rows selected.

SQL>
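
The title also mentions the ":" delimiter, and the note above suggests selecting "level" for the index. A possible sketch combining the two is shown below; the column names idx, prefix and suffix are invented for illustration, and the input is shortened to three tokens.

```sql
with tbl ( str ) as (
       select 'ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0,ABCDEF:PmId12345RmLn1VlId0'
       from dual
     ),
     tokens ( idx, token ) as (
       -- "level" numbers the tokens in the order they appear in the string
       select level, regexp_substr(str, '(.*?)(,|$)', 1, level, null, 1)
       from   tbl
       connect by regexp_substr(str, '(.*?)(,|$)', 1, level) is not null
     )
select idx,
       substr(token, 1, instr(token, ':') - 1) as prefix,   -- text before the first ':'
       substr(token, instr(token, ':') + 1)    as suffix    -- text after the first ':'
from   tokens
order  by idx;
```

Each of the three rows returns ABCDEF as prefix and PmId12345RmLn1VlId0 as suffix. If a token contains no ":", prefix comes back null and suffix holds the whole token, so that case may need separate handling.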