Oracle INSTR exact match

I have the following query:

SQL> select * from RTECS_ABBREV ra
  2  where instr(trim('100 mmol/plate (-S9)'), ra.abbrev) > 0;

ABBREV                         DEFINITION
------------------------------ --------------------------------------------------------------------------------
mmo                            Mutation in Micro-organism
mmol                           millimole
mol                            mole
S                              second

SQL> 

I would like to get the following result instead:

SQL> select * from RTECS_ABBREV ra
  2  where instr(trim('100 mmol/plate (-S9)'), ra.abbrev) > 0;

ABBREV                         DEFINITION
------------------------------ --------------------------------------------------------------------------------
mmol                           millimole
S                              second

SQL> 

because "mmo" and "mol" are only parts of the word "mmol".

More details:

Suppose I have the following data:

with abbr as
(
      select 'mmo' as abbrev from dual union 
      select 'mmol' as abbrev from dual union
      select 'mol' as abbrev from dual union
      select 'ug' as abbrev from dual union
      select 'mg' as abbrev from dual union
      select 'ppm' as abbrev from dual union
      select 'nmol' as abbrev from dual union
      select 'nm' as abbrev from dual union
      select 'ol' as abbrev from dual union
      select 'S' as abbrev from dual
),
main_data  as
(
select '24231' as id_, '10 ug/plate (-S9)' as data_ from dual union 
select '24232' as id_, '1 pph' as data_ from dual union 
select '24233' as id_, '100 mmol/plate (-S9)' as data_ from dual union 
select '24234' as id_, '100 mmol/plate (-S9)' as data_ from dual union 
select '24235' as id_, '1 pph' as data_ from dual union 
select '24236' as id_, '19300 nmol/L (-S9)' as data_ from dual union 
select '24237' as id_, '800 mg/L' as data_ from dual union 
select '24238' as id_, '600 ppm/2H-C (-S9)' as data_ from dual union 
select '24239' as id_, '500 mg/L (-S9)' as data_ from dual union 
select '24240' as id_, '2000 ppm (-S9)' as data_ from dual union 
select '24241' as id_, '100 mmol/plate (-S9)' as data_ from dual union 
select '24242' as id_, '1 pph (-S9)' as data_ from dual union 
select '24243' as id_, 'ihl 2700 ppm' as data_ from dual union 
select '24244' as id_, 'par 10 mmol/L' as data_ from dual union 
select '24245' as id_, 'mul 1 pph/8H-C' as data_ from dual                          
)
select * from main_data

In "main_data.data_", I need to replace every word that matches an entry in "abbr.abbrev" with another string (for example, "test").

For example, for "100 mmol/plate (-S9)" I need:

100 test/plate (-test9), but not

100 testl/plate (-test9) or 100 testol/plate (-test9)

So the rule seems to be: replace whole-word matches of "abbr.abbrev", and if the string is between parentheses (), replace any matching characters.

Given your sample data, I think you need something like the following:

SELECT * FROM main_data INNER JOIN abbr
    ON REGEXP_LIKE(main_data.data_, '(^|\W)' || abbr.abbrev || '(\W|$)');

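I use the regular expression above because Oracle regular expressions do not support word boundaries (there is no \b). The first group checks for either the start of the string or a "non-word" character (one that is neither alphanumeric nor an underscore _). The second (closing) group checks for either a non-word character or the end of the string.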
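To see what this pattern does with the string from the question, here is a minimal sketch that runs it against four of the abbreviations. The inline abbr list and the explicit 'c' (case-sensitive) match parameter are my additions for illustration; they are not part of the original query.

```sql
-- Sketch: which abbreviations match as whole words in the sample string?
WITH abbr AS (
  SELECT 'mmo'  AS abbrev FROM dual UNION ALL
  SELECT 'mmol' AS abbrev FROM dual UNION ALL
  SELECT 'mol'  AS abbrev FROM dual UNION ALL
  SELECT 'S'    AS abbrev FROM dual
)
SELECT a.abbrev
FROM abbr a
WHERE REGEXP_LIKE('100 mmol/plate (-S9)',
                  '(^|\W)' || a.abbrev || '(\W|$)',
                  'c');  -- 'c' forces case-sensitive matching
```

Only mmol should come back. "mmo" is rejected because it is followed by the letter l, and "mol" because it is preceded by the letter m. Note that "S" is rejected too: in "(-S9)" it is followed by the digit 9, which is a word character, so the parenthesised case from the question would need an additional rule.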

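It strikes me that if the value always consists of a quantity followed by its unit, then checking for the start of the string (the ^ anchor) is not really necessary.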

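If you want to do the replacement, you will need to use REGEXP_REPLACE() with the regular expression above rather than just REGEXP_LIKE(). Because the pattern also matches the delimiter characters on either side of the abbreviation, the replacement string has to put them back, for example with the backreferences \1 and \2.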
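As a rough sketch of that replacement, the query below swaps a single abbreviation for 'test' in one sample value. The hard-coded 'mmol' and the column alias replaced are placeholders of mine; in practice the abbreviation would come from abbr.abbrev.

```sql
-- Sketch: replace one whole-word abbreviation and keep its delimiters.
-- \1 is the leading group (start of string or non-word character),
-- \2 is the trailing group (non-word character or end of string).
SELECT REGEXP_REPLACE('100 mmol/plate (-S9)',
                      '(^|\W)' || 'mmol' || '(\W|$)',
                      '\1test\2') AS replaced
FROM dual;
```

This should give 100 test/plate (-S9). The space before mmol and the slash after it are captured and written back, so only the abbreviation itself changes. Two things to keep in mind: an abbreviation containing regex metacharacters would need escaping before being concatenated into the pattern, and when two abbreviations are separated by a single delimiter, that delimiter is consumed by the first match and cannot serve as the leading delimiter of the second.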