Extract Emails Lying Between Special Characters Using Regex in SQL
How can I use regex in SQL to extract only the emails from this specific string pattern?
I have: tb_1
Logmessage |
---|
Alan Robert <alan.robert@gmail.com> was assigned to <richard@yahoo.com> and <nelson@gmail.com> |
Alan Robert <alan.robert@gmail.com> was unassigned to <khanjoyty@gmail.com> and <katy@gmail.com> |
What I want: tb_2
email_1 | email_2 | email_3 |
---|---|---|
alan.robert@gmail.com | richard@yahoo.com | nelson@gmail.com |
alan.robert@gmail.com | khanjoyty@gmail.com | katy@gmail.com |
I already have a solution, but tb_1 has a large number of rows, so my query takes too long to return its output. That is why I think regex might be faster.
My query:
with cte as (
    -- Keep everything from the first '<', turn the separators into '.',
    -- and wrap each email in [ ] so PARSENAME treats it as a single part
    Select replace(replace(replace(replace(
               right(logmessage, len(logmessage) - charindex('<', logmessage) + 1),
               Case When logmessage like '%unassigned%' Then ' was unassigned to '
                    When logmessage like '%assigned%'   Then ' was assigned to ' End, '.'),
               ' and ', '.'),
               '<', '['), '>', ']') as logmessage
    From tb_1
)
-- PARSENAME counts parts from the right, so part 3 is the first email
Select PARSENAME(logmessage, 3) AS email_1,
       PARSENAME(logmessage, 2) AS email_2,
       PARSENAME(logmessage, 1) AS email_3
From cte
With the use of a helper function
Example or dbFiddle
Declare @YourTable Table (LogID int,[Logmessage] varchar(500))
Insert Into @YourTable Values
 (1,'Alan Robert <alan.robert@gmail.com> was assigned to <richard@yahoo.com> and <nelson@gmail.com>')
,(2,'Alan Robert <alan.robert@gmail.com> was unassigned to <khanjoyty@gmail.com> and <katy@gmail.com>')

Select A.LogID
      ,B.*
 From  @YourTable A
 Cross Apply [dbo].[tvf-Str-Extract-JSON](LogMessage,'<','>') B
Results
LogID | RetSeq | RetVal |
---|---|---|
1 | 1 | alan.robert@gmail.com |
1 | 2 | richard@yahoo.com |
1 | 3 | nelson@gmail.com |
2 | 1 | alan.robert@gmail.com |
2 | 2 | khanjoyty@gmail.com |
2 | 3 | katy@gmail.com |
It then becomes a small matter to pivot the results into the email_1, email_2, and email_3 columns.
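One way to do that pivot is conditional aggregation on RetSeq. The following is only a sketch: it assumes the [dbo].[tvf-Str-Extract-JSON] function from this answer has already been created, that LogID is unique per row, and that no message holds more than three bracketed values. The column aliases are taken from the desired tb_2 layout in the question.

```sql
-- Sketch: turn the RetSeq/RetVal rows into one row per log message.
-- Assumes [dbo].[tvf-Str-Extract-JSON] exists and LogID is unique.
Declare @YourTable Table (LogID int, Logmessage varchar(500));
Insert Into @YourTable Values
 (1,'Alan Robert <alan.robert@gmail.com> was assigned to <richard@yahoo.com> and <nelson@gmail.com>')
,(2,'Alan Robert <alan.robert@gmail.com> was unassigned to <khanjoyty@gmail.com> and <katy@gmail.com>');

Select A.LogID
      -- Each CASE picks the value for one position; MAX collapses the group
      ,email_1 = max(case when B.RetSeq = 1 then B.RetVal end)
      ,email_2 = max(case when B.RetSeq = 2 then B.RetVal end)
      ,email_3 = max(case when B.RetSeq = 3 then B.RetVal end)
 From  @YourTable A
 Cross Apply [dbo].[tvf-Str-Extract-JSON](A.Logmessage,'<','>') B
 Group By A.LogID;
```

With the two sample rows this should return one row per LogID in the email_1 / email_2 / email_3 shape asked for. A message with fewer than three bracketed values would simply leave the trailing columns NULL.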
The TVF, if you are interested
CREATE FUNCTION [dbo].[tvf-Str-Extract-JSON] (@String varchar(max),@Delim1 varchar(100),@Delim2 varchar(100))
Returns Table
As
Return (
    -- Keep only the fragments that contain the closing delimiter and
    -- return the text before it, renumbered 1..n
    Select RetSeq = row_number() over (order by RetSeq)
          ,RetVal = left(RetVal,charindex(@Delim2,RetVal)-1)
     From (
            -- Split @String on @Delim1 by rewriting it as a JSON array
            Select RetSeq = [Key]+1
                  ,RetVal = trim(Value)
             From  OpenJSON( '["'+replace(string_escape(@String,'json'),@Delim1,'","')+'"]' )
          ) C1
     Where charindex(@Delim2,RetVal)>1
)
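To see what the function is doing, it may help to run its steps by hand on one of the sample messages. The snippet below is an illustrative sketch, not part of the original function: the variable names @S and @Json and the column aliases are made up for the example, and the delimiters are hard-coded to '<' and '>'. It relies on string_escape and OpenJSON, which need SQL Server 2016 or later (the function itself also uses trim, which needs 2017 or later).

```sql
Declare @S varchar(max) =
    'Alan Robert <alan.robert@gmail.com> was assigned to <richard@yahoo.com> and <nelson@gmail.com>';

-- Step 1: replacing every '<' with '","' rewrites the string as a JSON array:
-- ["Alan Robert ","alan.robert@gmail.com> was assigned to ","richard@yahoo.com> and ","nelson@gmail.com>"]
Declare @Json varchar(max) = '["' + replace(string_escape(@S,'json'),'<','","') + '"]';
Select JsonArray = @Json;

-- Step 2: OpenJSON returns one row per array element ([Key] is 0-based).
-- Step 3: only fragments containing '>' hold a value; take the text before it.
Select [Key]
      ,Fragment  = Value
      ,Extracted = case when charindex('>',Value) > 1
                        then left(Value, charindex('>',Value)-1) end
 From  OpenJSON(@Json);
```

The first fragment ("Alan Robert ") has no '>' and so yields NULL here; the function drops such rows with its Where clause and then renumbers the remaining ones with row_number().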