Oracle regexp: n characters before and after a substring, excluding whitespace

I want to select the 10 characters before and after a specific substring in an Oracle string. Suppose the table contents are:

with tmp as (
select 'aa bbbb cccc xx ddddd eeeeeee' as main_text,
'xx' as sub_text
from dual
)
select * from tmp;

I would like the output to be:
aa bbbb cccc xx ddddd eeeee

So I want to exclude whitespace and count 10 characters to the left and to the right of the substring 'xx'.

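It is simple when whitespace is counted, but I cannot come up with the logic to do it in a single query when whitespace is not counted.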

Experts, please help. I am using Oracle 11g :D

You can construct a regular expression to do this:

select tmp.*,
       regexp_substr(main_text, '.{0,10}' || sub_text || '.{0,10}')
from tmp;

Note: this does not return exactly what you specified, because it counts each space as a character.
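To make that note concrete, here is the same query as a self-contained sketch with the sample row from the question inlined. The column alias `plus_minus_ten` is just an illustrative name. Because the spaces use up part of each 10-character budget, the window comes out narrower than the desired output.

```sql
-- Self-contained version of the query above, using the question's sample data.
with tmp as (
  select 'aa bbbb cccc xx ddddd eeeeeee' as main_text,
         'xx' as sub_text
  from dual
)
select tmp.*,
       -- .{0,10} takes up to 10 characters of any kind, spaces included
       regexp_substr(main_text, '.{0,10}' || sub_text || '.{0,10}') as plus_minus_ten
from tmp;
-- plus_minus_ten: 'bbbb cccc xx ddddd eee'
```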

If you don't want to count spaces, you can skip over them in the pattern:

select tmp.*,
       regexp_substr(main_text, '([^ ] *){0,10}' || sub_text || '( *[^ ] *){0,10}')
from tmp;

Here is a db<>fiddle.
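As a reading aid, the space-skipping pattern can be run on its own with the sample row inlined and each piece commented. This is only a sketch: it assumes `sub_text` contains no regular expression metacharacters, and since `[^ ]` excludes only the space character, tabs and newlines would still be counted.

```sql
with tmp as (
  select 'aa bbbb cccc xx ddddd eeeeeee' as main_text,
         'xx' as sub_text
  from dual
)
select tmp.*,
       regexp_substr(
         main_text,
         -- up to 10 times: one non-space character plus any spaces after it
         '([^ ] *){0,10}'
         || sub_text
         -- up to 10 times: any spaces, one non-space character, any spaces
         || '( *[^ ] *){0,10}'
       ) as plus_minus_ten
from tmp;
-- plus_minus_ten: 'aa bbbb cccc xx ddddd eeeee'
```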

You can use REGEXP_SUBSTR with capturing groups to get the substrings before and after:

with tmp ( main_text, sub_text ) as (
  SELECT 'aa bbbb cccc xx ddddd eeeeeee', 'xx' FROM DUAL
)
SELECT t.*,
       REGEXP_SUBSTR(
         main_text,
         '((\S\s*){0,10})' || sub_text || '((\s*\S){0,10})',
         1,
         1,
         NULL,
         1
       ) AS before_text,
       REGEXP_SUBSTR(
         main_text,
         '((\S\s*){0,10})' || sub_text || '((\s*\S){0,10})',
         1,
         1,
         NULL,
         3
       ) AS after_text
FROM   tmp t;

Output:

MAIN_TEXT                     | SUB_TEXT | BEFORE_TEXT   | AFTER_TEXT  
:---------------------------- | :------- | :------------ | :-----------
aa bbbb cccc xx ddddd eeeeeee | xx       | aa bbbb cccc  |  ddddd eeeee

If you also want to strip the whitespace from the results, then:

with tmp ( main_text, sub_text ) as (
  SELECT 'aa bbbb cccc xx ddddd eeeeeee', 'xx' FROM DUAL
)
SELECT t.*,
       REGEXP_REPLACE(
         REGEXP_SUBSTR(
           main_text,
           '((\S\s*){0,10})' || sub_text || '((\s*\S){0,10})',
           1,
           1,
           NULL,
           1
         ),
         '\s+'
       ) AS before_text,
       REGEXP_REPLACE(
         REGEXP_SUBSTR(
           main_text,
           '((\S\s*){0,10})' || sub_text || '((\s*\S){0,10})',
           1,
           1,
           NULL,
           3
         ),
         '\s+'
       ) AS after_text
FROM   tmp t;

Output:

MAIN_TEXT                     | SUB_TEXT | BEFORE_TEXT | AFTER_TEXT
:---------------------------- | :------- | :---------- | :---------
aa bbbb cccc xx ddddd eeeeeee | xx       | aabbbbcccc  | dddddeeeee

db<>fiddle here

Thanks @Gordon and @MT0. Using ideas from both of your solutions:

with tmp as (
select
    'aa bbbb cccc xx ddddd eeeeeee' as main_text,
    'xx' as sub_text
from dual
)
select tmp.*,
       regexp_substr(
            main_text, 
            '(([^ ] *){0,10})' || sub_text || '(( *[^ ] *){0,10})'
       ) as plus_minus_ten,
       regexp_substr(
            main_text, 
            '(([^ ] *){0,10})' || sub_text || '(( *[^ ] *){0,10})', 
            1, 
            1, 
            NULL, 
            1 
       ) as before_sub_text,
       regexp_substr(main_text, 
            '(([^ ] *){0,10})' || sub_text || '(( *[^ ] *){0,10})', 
            1, 
            1, 
            NULL,
            3
       ) as after_sub_text
from tmp;
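Since the title asks about n characters rather than exactly 10, the quantifier could also be built from a value instead of being hard-coded. The following is a minimal sketch under a few assumptions: `n_chars` is a hypothetical column introduced here for illustration, it holds a non-negative integer, and `sub_text` contains no regular expression metacharacters.

```sql
with tmp as (
  select 'aa bbbb cccc xx ddddd eeeeeee' as main_text,
         'xx' as sub_text,
         4 as n_chars            -- hypothetical window size, not in the original question
  from dual
)
select tmp.*,
       regexp_substr(
         main_text,
         -- splice n_chars into the {0,n} quantifier on both sides
         '(([^ ] *){0,' || to_char(n_chars) || '})'
         || sub_text
         || '(( *[^ ] *){0,' || to_char(n_chars) || '})'
       ) as plus_minus_n
from tmp;
-- plus_minus_n: 'cccc xx dddd'
```

With `n_chars` set to 10 the pattern is identical to the hard-coded one in the query above.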