Regexp or substr or another method to find a string
I want the best possible performance when selecting only the string that comes after the label "DL:".
I have a column (varchar2) with values like these:
DL:1011909825
Obj:020190004387 DL:8010406429
Obj:020190004388 DL:8010406428
DL:190682
DL:PDL01900940
Obj:020190004322 DL:611913067
So the output should look like this:
1011909825
8010406429
8010406428
190682
PDL01900940
611913067
I'm not a regex expert, but I tried regexp_replace:
regexp_replace(column,'Obj:|DL:','',1, 0, 'i')
That is close, but the output is still not what I want, because the Obj values are left in place:
1011909825
020190004387 8010406429
020190004388 8010406428
190682
PDL01900940
020190004322 611913067
How can I fix this and get the best performance?
If the data always looks like this, then SUBSTR + INSTR does the job:
SQL> with test (col) as
       (
        select 'DL:1011909825' from dual union all
        select 'Obj:020190004387 DL:8010406429' from dual union all
        select 'Obj:020190004388 DL:8010406428' from dual union all
        select 'DL:190682' from dual union all
        select 'DL:PDL01900940' from dual union all
        select 'Obj:020190004322 DL:611913067' from dual
       )
     select col, substr(col, instr(col, 'DL:') + 3) result
     from test;
COL                            RESULT
------------------------------ ------------------------------
DL:1011909825                  1011909825
Obj:020190004387 DL:8010406429 8010406429
Obj:020190004388 DL:8010406428 8010406428
DL:190682                      190682
DL:PDL01900940                 PDL01900940
Obj:020190004322 DL:611913067  611913067
6 rows selected.
SQL>
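One thing to keep in mind with this approach: if a row has no DL: label, INSTR returns 0 and the expression becomes SUBSTR(col, 3), which silently returns the value from its third character onward. A minimal sketch of a guard, assuming such rows should yield NULL (the third sample row is hypothetical and not from the question):

```sql
with test (col) as
  (
   select 'DL:1011909825'                  from dual union all
   select 'Obj:020190004387 DL:8010406429' from dual union all
   select 'Obj:020190004999'               from dual  -- hypothetical row without a DL: label
  )
select col,
       -- unguarded: returns 'j:020190004999' for the row without DL:
       substr(col, instr(col, 'DL:') + 3) as unguarded,
       -- guarded: a CASE with no ELSE branch returns NULL when DL: is absent
       case
         when instr(col, 'DL:') > 0
         then substr(col, instr(col, 'DL:') + 3)
       end as guarded
from test;
```

If every row is known to contain DL:, the guard is unnecessary and the original one-liner is enough.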
A REGEXP_SUBSTR version might look like this:
<snip>
     select col,
            ltrim(regexp_substr(col, 'DL:\w+'), 'DL:') result
     from test;
COL                            RESULT
------------------------------ -----------------------------
DL:1011909825                  1011909825
Obj:020190004387 DL:8010406429 8010406429
Obj:020190004388 DL:8010406428 8010406428
DL:190682                      190682
DL:PDL01900940                 PDL01900940
Obj:020190004322 DL:611913067  611913067
With a lot of data, the SUBSTR + INSTR version should be much faster than the regular expression.
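One caveat about the LTRIM call in the REGEXP_SUBSTR version: the second argument of LTRIM is a set of characters, not a prefix, so it keeps stripping any leading D, L or : it finds. The sample data happens to be safe because PDL01900940 starts with P, but a value such as DL:LD123 (a hypothetical value, not from the question) would lose its leading letters. A small sketch of the difference, with SUBSTR(..., 4) used as one possible way to drop exactly the three characters of the label:

```sql
with test (col) as
  (
   select 'DL:PDL01900940' from dual union all
   select 'DL:LD123'       from dual  -- hypothetical value that starts with L and D
  )
select col,
       -- LTRIM treats 'DL:' as the character set {D, L, :}, so 'DL:LD123' becomes '123'
       ltrim(regexp_substr(col, 'DL:\w+'), 'DL:') as with_ltrim,
       -- SUBSTR from position 4 skips only the 3-character label, so 'DL:LD123' becomes 'LD123'
       substr(regexp_substr(col, 'DL:\w+'), 4)    as with_substr
from test;
```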
substr + instr will have better performance, but if you want to use a regular expression:
-- substr + instr will have better performance
with s (str) as (
select 'DL:1011909825' from dual union all
select 'Obj:020190004387 DL:8010406429' from dual union all
select 'Obj:020190004388 DL:8010406428' from dual union all
select 'DL:190682' from dual union all
select 'DL:PDL01900940' from dual union all
select 'Obj:020190004322 DL:611913067' from dual)
select str, regexp_substr(str, 'DL:(.*)', 1, 1, null, 1) rs
from s;
STR                            RS
------------------------------ ------------------------------
DL:1011909825                  1011909825
Obj:020190004387 DL:8010406429 8010406429
Obj:020190004388 DL:8010406428 8010406428
DL:190682                      190682
DL:PDL01900940                 PDL01900940
Obj:020190004322 DL:611913067  611913067
6 rows selected.
The matches for this pattern might give you some ideas:
DL:(.*)
Match 1
1. 1011909825
Match 2
1. 8010406429
Match 3
1. 8010406428
Match 4
1. 190682
Match 5
1. PDL01900940
Match 6
1. 611913067
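The pattern DL:(.*) works here because DL: is always the last item in the value. Since .* is greedy, it would also capture anything that follows the DL value. The sketch below assumes the DL value never contains whitespace and uses a hypothetical row in which the order of the two labels is swapped. It also relies on the sixth argument of REGEXP_SUBSTR (the subexpression index), which is not available in older Oracle releases.

```sql
with s (str) as (
  select 'Obj:020190004387 DL:8010406429' from dual union all
  select 'DL:8010406429 Obj:020190004387' from dual  -- hypothetical: DL: comes first
)
select str,
       -- greedy: returns '8010406429 Obj:020190004387' for the second row
       regexp_substr(str, 'DL:(.*)',  1, 1, null, 1) as greedy,
       -- stops at the first whitespace: returns '8010406429' for both rows
       regexp_substr(str, 'DL:(\S+)', 1, 1, null, 1) as up_to_space
from s;
```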
Or use regexp_substr:
with t(str) as
(
select 'DL:1011909825' from dual union all
select 'Obj:020190004387 DL:8010406429' from dual union all
select 'Obj:020190004388 DL:8010406428' from dual union all
select 'DL:190682' from dual union all
select 'DL:PDL01900940' from dual union all
select 'Obj:020190004322 DL:611913067' from dual
)
select regexp_substr(str, '[^DL:]+$') as str
from t;
STR
----------
1011909825
8010406429
8010406428
190682
01900940
611913067
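Note that the fifth row comes back as 01900940 rather than the expected PDL01900940. The bracket expression [^DL:] is a negated character class meaning "any character other than D, L or :", not "anything other than the string DL:", so the match cannot include the D and L inside PDL01900940. A possible variant, assuming the DL value itself never contains a colon and is always the last item in the string, is to match everything after the last colon:

```sql
with t (str) as
  (
   select 'DL:PDL01900940'                 from dual union all
   select 'Obj:020190004322 DL:611913067'  from dual
  )
select str,
       -- excludes D, L and : individually: returns '01900940' for the first row
       regexp_substr(str, '[^DL:]+$') as original_pattern,
       -- excludes only the colon: returns 'PDL01900940' and '611913067'
       regexp_substr(str, '[^:]+$')   as after_last_colon
from t;
```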