How to convert a Hive query with regex to Oracle
I have this text:
Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples
I only want the part after 'Process explanation', not including 'Final activity...'.
Like this:
The bottle is then melted to form liquid glass.
This is my current Hive query, which I want to convert to Oracle:
SELECT REGEXP_EXTRACT(
'Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples',
'.*(process[ \t]*(explanation)?[ \t]*:[ \t]*)(.*?)([ \t]*;[ \t]*final[ \t]+activity[ \t]+for[ \t]+manager.*$|$)',
3) as extracted
FROM my_table
If those substrings are exactly as you described, there is a very simple option: the substr + instr functions.
with test (col) as
  (select 'Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples' from dual)
select substr(col, instr(col, 'Process explanation') + length('Process explanation') + 1,
              instr(col, 'Final activity') - instr(col, 'Process explanation') -
              length('Process explanation') - 2
             ) result
from test;

RESULT
----------------------------------------------
The bottle is then melted to form liquid glass
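The substr + instr arithmetic above depends on both markers being present: when 'Final activity' is missing, instr returns 0 and the computed length goes negative, so substr returns NULL. The following is a minimal sketch of a guarded variant, not part of the original answer. The `positions` subquery and its column names (`proc_pos`, `final_pos`, `proc_len`) are hypothetical names introduced here for illustration, and the second test row is an assumed input without the trailing part.

```sql
with test (col) as (
  select 'Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples' from dual
  union all
  -- assumed extra input: no 'Final activity' section at all
  select 'Process explanation:The bottle is then melted to form liquid glass' from dual
),
positions as (
  -- hypothetical helper: compute each marker position once
  select col,
         instr(col, 'Process explanation')  as proc_pos,
         instr(col, 'Final activity')       as final_pos,
         length('Process explanation')      as proc_len
  from test
)
select col,
       case
         -- start marker missing: nothing to extract
         when proc_pos = 0 then null
         -- both markers present: same arithmetic as above
         -- (+ 1 skips the ':', - 2 drops the ':' and the ';')
         when final_pos > proc_pos then
           substr(col, proc_pos + proc_len + 1,
                  final_pos - proc_pos - proc_len - 2)
         -- end marker missing: take everything after the ':'
         else substr(col, proc_pos + proc_len + 1)
       end as result
from positions;
```

With these two rows, both branches should produce the same text, 'The bottle is then melted to form liquid glass'. Like the original, this sketch is case-sensitive and assumes exactly one character (the ':') follows 'Process explanation'.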
I came up with something like this:
with strings as
(SELECT '1Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples' str FROM DUAL
union all
SELECT '2Process explanation:The bottle is then melted to form liquid glass;' str FROM DUAL
union all
SELECT '3Process :The bottle is then melted to form liquid glass' str FROM DUAL
union all
SELECT '4Process explanation: plasma gasification combined with centrifugal activity' str FROM DUAL
union all
SELECT '5Final activity for manager:Labeling of previous samples' str FROM DUAL
)
SELECT str
, REGEXP_SUBSTR(
str,
'(.*process[[:blank:]]*(explanation)?[[:blank:]]*:[[:blank:]]*)([A-Za-z0-9 ]*)([[:blank:]]*;[[:blank:]]*final[[:blank:]]*activity[[:blank:]]*for[[:blank:]]*manager.*$)?',
1, 1, 'i',3)
as extracted
FROM strings
Which results in:
STR | EXTRACTED |
---|---|
1Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples | The bottle is then melted to form liquid glass |
2Process explanation:The bottle is then melted to form liquid glass; | The bottle is then melted to form liquid glass |
3Process :The bottle is then melted to form liquid glass | The bottle is then melted to form liquid glass |
4Process explanation: plasma gasification combined with centrifugal activity | plasma gasification combined with centrifugal activity |
5Final activity for manager:Labeling of previous samples | - |
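This assumes it is acceptable to match the [[:blank:]] character class instead of your explicit space-and-tab list [ \t].
Edit: I modified the regular expression slightly, because with the last group possibly empty, '.*' was capturing the whole line.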
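The character class [A-Za-z0-9 ] in the third group stops at the first punctuation mark, so text containing commas, hyphens, or periods would be cut short. One possible variant, offered as a sketch rather than as the answer's method, captures everything up to the first semicolon with [^;]* instead. It assumes that a ';' never occurs inside the text to be extracted, which is a stricter condition than in the Hive pattern, where only ';final activity for manager' ends the match. The third sample row is an assumed input added for illustration.

```sql
with strings as (
  select '1Process explanation:The bottle is then melted to form liquid glass;Final activity for manager:Labeling of previous samples' str from dual
  union all
  select '4Process explanation: plasma gasification combined with centrifugal activity' str from dual
  union all
  -- assumed input containing punctuation that [A-Za-z0-9 ] would not match
  select '6Process explanation:Melted at 1,500 degrees (approx.);Final activity for manager:None' str from dual
)
select str,
       regexp_substr(
         str,
         -- group 3 takes every character up to, but not including, the first ';'
         '(.*process[[:blank:]]*(explanation)?[[:blank:]]*:[[:blank:]]*)([^;]*)',
         1,    -- start searching at position 1
         1,    -- first occurrence
         'i',  -- case-insensitive match
         3     -- return subexpression 3
       ) as extracted
from strings;
```

Blanks directly before the ';' would end up in the result with this pattern, so wrapping the call in rtrim may be worthwhile if the data can contain them.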