Split comma-separated values with line gaps and nulls into table columns via a PL/SQL procedure
I have a CLOB string value in a table that I need to split into columns.
Source table query:
Insert into disp_data(id,data) values(100,
'"Project title as per the outstanding Requirements","The values are not with respect to the requirement and analysis done by the team.
Also it is difficult to prepare a scenario notwithstanding the fact it is difficult. This user story is going to be slightly complex however it is up to the team","Active","Disabled","25 tonnes of fuel","www.examplesites.com/html.asp&net;","","","","","25"');
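The CLOB column value also contains spaces, nulls and line gaps.
So when I try to split it using
select regexp_substr(data,'[^,]+',1,level) from disp_data
connect by regexp_substr(data,'[^,]+',1,level) is not null;
the problem is the large text with line gaps: it gets split into different rows. I thought of using the above result set with a pivot, but I couldn't get that to work.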
I need to get this data as columns and push it into the target table, push_data_temp.
select pid,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11 from push_data_temp;
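The CLOB column has 11 comma-separated values, which need to be pushed into this table as columns.
The whole process needs to be done through a PL/SQL procedure.
The result in push_data_temp should look like below.
Any help would be much appreciated.
The database is Oracle 19c.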
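Your regular expression needs to allow for nulls, i.e. consecutive commas (but hopefully you don't have commas within any of the quoted strings...). If you have multiple source rows, it's easier to split with a recursive CTE: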
with rcte (id, data, lvl, result) as (
-- the 'n' match parameter lets the period match the line breaks inside the data
select id, data, 1, regexp_substr(data, '(.*?)(,|$)', 1, 1, 'n', 1)
from disp_data
union all
select id, data, lvl + 1, regexp_substr(data, '(.*?)(,|$)', 1, lvl + 1, 'n', 1)
from rcte
where lvl <= regexp_count(data, ',')
)
select id, lvl, result
from rcte
order by id, lvl;
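To see why the pattern has to allow for nulls, here is a small comparison on a made-up literal ('a,,c' is an assumption for illustration, not the asker's data). It is only a sketch: the first expression uses the original '[^,]+' pattern, the other two use the pattern from the query above.

```sql
-- '[^,]+' only matches non-empty runs, so the empty middle element is skipped
-- and 'c' comes back as the second token instead of the third.
-- '(.*?)(,|$)' matches an empty run too, so positions are preserved.
select regexp_substr('a,,c', '[^,]+', 1, 2)               as skips_null,   -- 'c'
       regexp_substr('a,,c', '(.*?)(,|$)', 1, 2, null, 1) as keeps_null,   -- null
       regexp_substr('a,,c', '(.*?)(,|$)', 1, 3, null, 1) as third_token   -- 'c'
from dual;
```

With the first pattern every value after a null would shift one column to the left once the result is pivoted.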
You can then pivot the result into the columns you want:
with rcte (id, data, lvl, result) as (
-- the 'n' match parameter lets the period match the line breaks inside the data
select id, data, 1, regexp_substr(data, '(.*?)(,|$)', 1, 1, 'n', 1)
from disp_data
union all
select id, data, lvl + 1, regexp_substr(data, '(.*?)(,|$)', 1, lvl + 1, 'n', 1)
from rcte
where lvl <= regexp_count(data, ',')
)
select *
from (
select id, lvl, result
from rcte
)
pivot (max(result) as col for (lvl) in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11));
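If the pivot step is unfamiliar, it may help to look at it in isolation. The following is a minimal sketch on three hand-written rows (the inline data and the col1..col3 aliases are assumptions for illustration); it turns one row per token into one column per token, keeping the null in the middle.

```sql
-- Three (id, lvl, result) rows standing in for the output of the recursive CTE
with t (id, lvl, result) as (
  select 100, 1, 'a'  from dual union all
  select 100, 2, null from dual union all
  select 100, 3, 'c'  from dual
)
select *
from t
-- one output column per listed lvl value; max() just picks the single value per cell
pivot (max(result) for lvl in (1 as col1, 2 as col2, 3 as col3));
-- returns one row: ID = 100, COL1 = 'a', COL2 = null, COL3 = 'c'
```

Aliasing the values in the IN list is optional; it only gives the generated columns readable names.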
You can use that directly in an insert statement:
insert into push_data_temp (pid,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11)
with rcte (id, data, lvl, result) as (
-- the 'n' match parameter lets the period match the line breaks inside the data
select id, data, 1, regexp_substr(data, '(.*?)(,|$)', 1, 1, 'n', 1)
from disp_data
union all
select id, data, lvl + 1, regexp_substr(data, '(.*?)(,|$)', 1, lvl + 1, 'n', 1)
from rcte
where lvl <= regexp_count(data, ',')
)
select *
from (
select id, lvl, result
from rcte
)
pivot (max(result) as col for (lvl) in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11));
No PL/SQL is needed, but you can still wrap this in a procedure if you need to.
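Since the question asks for a PL/SQL procedure, wrapping the insert might look roughly like this. It is a sketch under assumptions: the procedure name load_push_data_temp is hypothetical, the tokens are cast to varchar2(4000) so that max() in the pivot also accepts a CLOB source column, and the commit is left to the caller.

```sql
-- load_push_data_temp is a hypothetical name; adjust it to your own conventions.
create or replace procedure load_push_data_temp as
begin
  insert into push_data_temp
    (pid, col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11)
  with rcte (id, data, lvl, result) as (
    -- 'n' lets the period match the line breaks inside the data;
    -- the cast is an assumption that limits each token to 4000 bytes
    select id, data, 1,
           cast(regexp_substr(data, '(.*?)(,|$)', 1, 1, 'n', 1) as varchar2(4000))
    from disp_data
    union all
    select id, data, lvl + 1,
           cast(regexp_substr(data, '(.*?)(,|$)', 1, lvl + 1, 'n', 1) as varchar2(4000))
    from rcte
    where lvl <= regexp_count(data, ',')
  )
  select *
  from (
    select id, lvl, result
    from rcte
  )
  pivot (max(result) as col for (lvl) in (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11));
end load_push_data_temp;
/

-- Usage: run the load, then commit (or roll back) in the calling session.
begin
  load_push_data_temp;
  commit;
end;
/
```

Keeping the commit outside the procedure lets the caller decide whether the load is part of a larger transaction.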
I have to take the data as a CLOB, and it throws an "inconsistent datatypes" error.
You need to cast the tokens to varchar2, which limits their length (to 4k or 32k, depending on your Oracle version and settings):
with rcte (id, data, lvl, result) as (
-- the 'n' match parameter lets the period match the line breaks inside the data
select id, data, 1,
cast(regexp_substr(data, '(.*?)(,|$)', 1, 1, 'n', 1) as varchar2(4000))
from disp_data
union all
select id, data, lvl + 1,
cast(regexp_substr(data, '(.*?)(,|$)', 1, lvl + 1, 'n', 1) as varchar2(4000))
from rcte
where lvl <= regexp_count(data, ',')
)
...
db<>fiddle using a CLOB (with the connect-by examples removed, as they break it...)
When I try it with text that has commas inside the values, it splits the data unevenly.
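That's why I said "hopefully you don't have commas within any of the quoted strings". As you don't have any really empty elements (you have ...","","... rather than ...,,...) you can skip the issues I was thinking of, and use a different pattern: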
with rcte (id, data, lvl, result) as (
select id, data, 1,
cast(regexp_substr(data, '("[^"]*"|[^,]+)', 1, 1, null, 1) as varchar2(4000))
from disp_data
union all
select id, data, lvl + 1,
cast(regexp_substr(data, '("[^"]*"|[^,]+)', 1, lvl + 1, null, 1) as varchar2(4000))
from rcte
where lvl <= regexp_count(data, '("[^"]*"|[^,]+)')
)
...
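As a quick check of what this pattern does with a quoted comma and an empty quoted element, here is a small sketch on a made-up literal (the string and the trim step are assumptions for illustration; the answer itself leaves the quotes in place).

```sql
-- Each token is either a whole "quoted" run (which may contain commas and
-- line breaks) or a run of non-comma characters.
select level as lvl,
       regexp_substr('"a,b","",c', '("[^"]*"|[^,]+)', 1, level) as raw_token,
       -- optional extra step: strip the enclosing double quotes
       trim('"' from regexp_substr('"a,b","",c', '("[^"]*"|[^,]+)', 1, level)) as token
from dual
connect by level <= regexp_count('"a,b","",c', '("[^"]*"|[^,]+)');
-- lvl 1: raw_token "a,b"  token a,b
-- lvl 2: raw_token ""     token null
-- lvl 3: raw_token c      token c
```

Note that trim would also remove a double quote that genuinely belongs at the start or end of a value.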
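If you do have to handle null elements then it's still possible, but more work. This also won't handle escaped double-quotes within the strings. At some point it becomes easier to write your own parser in PL/SQL; or even to write the data out to disk and read it back in as an external table, which can handle all of that for you.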
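For a rough idea of what such a hand-written parser could look like, here is a minimal sketch. Everything in it is an assumption rather than part of the answer: the function name parse_csv_line is hypothetical, a doubled double-quote inside a quoted value is treated as an escaped quote, and each value is limited to 4000 bytes because the result is returned as sys.odcivarchar2list. It walks the CLOB one character at a time, so it favours clarity over speed.

```sql
create or replace function parse_csv_line (p_line in clob)
  return sys.odcivarchar2list
as
  l_fields    sys.odcivarchar2list := sys.odcivarchar2list();
  l_field     varchar2(4000);
  l_in_quotes boolean := false;
  l_len       pls_integer := nvl(dbms_lob.getlength(p_line), 0);
  l_pos       pls_integer := 1;
  l_ch        varchar2(1 char);
begin
  while l_pos <= l_len loop
    l_ch := dbms_lob.substr(p_line, 1, l_pos);   -- one character at offset l_pos
    if l_in_quotes then
      if l_ch = '"' then
        if dbms_lob.substr(p_line, 1, l_pos + 1) = '"' then
          l_field := l_field || '"';             -- doubled quote: keep one literal quote
          l_pos := l_pos + 1;
        else
          l_in_quotes := false;                  -- closing quote
        end if;
      else
        l_field := l_field || l_ch;              -- commas and line breaks are kept as data
      end if;
    elsif l_ch = '"' then
      l_in_quotes := true;                       -- opening quote
    elsif l_ch = ',' then
      l_fields.extend;                           -- separator: store the value, even if null
      l_fields(l_fields.count) := l_field;
      l_field := null;
    else
      l_field := l_field || l_ch;
    end if;
    l_pos := l_pos + 1;
  end loop;
  l_fields.extend;                               -- value after the last separator
  l_fields(l_fields.count) := l_field;
  return l_fields;
end parse_csv_line;
/

-- Usage: one row per value; the empty quoted element comes back as null.
select column_value
from table(parse_csv_line(to_clob('"a,b","",c,"say ""hi"""')));
-- a,b
-- (null)
-- c
-- say "hi"
```

The returned collection can be indexed by position in PL/SQL, or pivoted in SQL in the same way as the recursive CTE result.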
Enter Polymorphic Table Functions!
You can use these to dynamically convert comma-separated strings into a list of columns:
create table disp_data (
id int, data varchar2(1000)
);
Insert into disp_data(id,data) values(100,
'"Project title as per the outstanding Requirements","The values are not with respect to the requirement and analysis done by the team.
Also it is difficult to prepare a scenario notwithstanding the fact it is difficult. This user story is going to be slightly complex however it is up to the team","Active","Disabled","25 tonnes of fuel","www.examplesites.com/html.asp&net;","","","","","25"');
commit;
create or replace package csv_pkg as
/* The describe function defines the new columns */
function describe (
tab in out dbms_tf.table_t,
col_names varchar2
) return dbms_tf.describe_t;
/* Fetch_rows sets the values for the new columns */
procedure fetch_rows (col_names varchar2);
end csv_pkg;
/
create or replace package body csv_pkg as
function describe(
tab in out dbms_tf.table_t,
col_names varchar2
)
return dbms_tf.describe_t as
new_cols dbms_tf.columns_new_t;
col_id pls_integer := 2;
begin
/* Enable the source column for reading */
tab.column(1).pass_through := FALSE;
tab.column(1).for_read := TRUE;
new_cols(1) := tab.column(1).description;
/* Extract the column names from the header string,
creating a new column for each
*/
for j in 1 .. ( length(col_names) - length(replace(col_names,',')) ) + 1 loop
new_cols(col_id) := dbms_tf.column_metadata_t(
name=>regexp_substr(col_names, '[^,]+', 1, j),--'c'||j,
type=>dbms_tf.type_varchar2
);
col_id := col_id + 1;
end loop;
return dbms_tf.describe_t( new_columns => new_cols );
end;
procedure fetch_rows (col_names varchar2) as
rowset dbms_tf.row_set_t;
row_count pls_integer;
begin
/* read the input data set */
dbms_tf.get_row_set(rowset, row_count => row_count);
/* Loop through the input rows... */
for i in 1 .. row_count loop
/* ...and the defined columns, extracting the relevant value
start from 2 to skip the input string
*/
for j in 2 .. ( length(col_names) - length(replace(col_names,',')) ) + 2 loop
rowset(j).tab_varchar2(i) :=
regexp_substr(rowset(1).tab_varchar2(i), '[^,]+', 1, j - 1);
end loop;
end loop;
/* Output the new columns and their values */
dbms_tf.put_row_set(rowset);
end;
end csv_pkg;
/
create or replace function csv_to_columns(
tab table, col_names varchar2
) return table pipelined row polymorphic using csv_pkg;
/
with rws as (
select data from disp_data
)
select c1, c2, c4, c4, c5, c6, c11
from csv_to_columns (
rws, 'c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11'
);
C1                   C2                             C4         C4         C5         C6                   C11
-------------------- ------------------------------ ---------- ---------- ---------- -------------------- ----------
"Project title as pe "The values are not with respe "Disabled" "Disabled" "25 tonnes  "www.examplesites.co "25"
r the outstanding Re ct to the requirement and anal                       of fuel"   m/html.asp&net;"
quirements"          ysis done by the team.
                     Also it is difficult to prepar
                     e a scenario notwithstanding t
                     he fact it is difficult. This
                     user story is going to be slig
                     htly complex however it is up
                     to the team"