SQL - Extracting Text between characters
This is what my data looks like. (I am trying to extract the exact email addresses so that I can send email to the To and CC recipients.)
EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]
EmailCC:[url=mailto:Test_Email_2@Yahoo.com] Test_Email_2@Yahoo.com[/url]
Hello, This is the rest of the email message....
When I run the first SQL statement, I get the result I want.
Select
Body,
SUBSTRING(Body, CHARINDEX('EmailTO', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailTO',Body)-20) ToEmail
This returns
ToEmail = Test_Email_1@Yahoo.com
But when I try to add a second SUBSTRING like this
Select
Body,
SUBSTRING(Body, CHARINDEX('EmailTO', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailTO',Body)-20) ToEmail,
SUBSTRING(Body, CHARINDEX('EmailCC', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailCC',Body)-20) CCEmail --(Simply replacing the EmailTo from the previous line to EmailCC)
From hdIssues
I get this error
"Msg 537, Level 16, State 5, Line 1 Invalid length parameter passed to the LEFT or SUBSTRING function."
Any help is appreciated.
P.S. In my data set, the email field can have multiple recipients, separated by semicolons, like this:
[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]; [url=mailto:Test_Email_5@Yahoo.com] Test_Email_5@Yahoo.com[/url]; [url=mailto:Test_Email_8@Yahoo.com] Test_Email_8@Yahoo.com[/url]
I would use regexp_substr (Oracle syntax):
with t1(col) as(
select 'EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]' from dual
)
select regexp_substr(col, '[[:alnum:]._%-]+@[[:alnum:]._%-]+\.com') as res
from t1;
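The pattern matches both copies of the email address (the one inside the mailto tag and the display text). I left it that way because you said in your P.S. that there may be multiple email addresses. Note that regexp_substr returns only the first match unless you pass its occurrence argument. You can modify the regex to pull only one copy of each email.
If you are open to a TVF (table-valued function)
Example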
Select A.ID
,B.*
From YourTable A
Cross Apply [dbo].[tvf-Str-Extract](A.Body,'[url=mailto:',']') B
Returns
ID  RetSeq  RetPos  RetVal
1   1       23      Test_Email_1@Yahoo.com
1   2       89      Test_Email_5@Yahoo.com
1   3       155     Test_Email_8@Yahoo.com
1   4       229     Test_Email_2@Yahoo.com
The TVF, if interested
CREATE FUNCTION [dbo].[tvf-Str-Extract] (@String varchar(max),@Delimiter1 varchar(100),@Delimiter2 varchar(100))
Returns Table
As
Return (
with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(N) As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
cte3(N) As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)
Select RetSeq = Row_Number() over (Order By N)
,RetPos = N
,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1)
From (
Select *,RetVal = Substring(@String, N, L)
From cte4
) A
Where charindex(@Delimiter2,RetVal)>1
)
/*
Max Length of String 1MM characters
Declare @String varchar(max) = 'Dear [[FirstName]] [[LastName]], ...'
Select * From [dbo].[tvf-Str-Extract] (@String,'[[',']]')
*/
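If creating a function is not an option, the same idea (find every opening delimiter, then read up to the next closing delimiter) can be sketched inline with a recursive CTE. This is only a rough sketch under assumptions: @Body and @Open are hypothetical variables standing in for one row's Body value and the opening delimiter, and every '[url=mailto:' is assumed to be followed by a ']'.

```sql
-- Hypothetical sample value, copied from the multi-recipient line in the question.
DECLARE @Body varchar(max) =
    '[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]; '
  + '[url=mailto:Test_Email_5@Yahoo.com] Test_Email_5@Yahoo.com[/url]; '
  + '[url=mailto:Test_Email_8@Yahoo.com] Test_Email_8@Yahoo.com[/url]';
DECLARE @Open varchar(100) = '[url=mailto:';

WITH Hits AS (
    -- Anchor: position of the first opening delimiter (0 if there is none).
    SELECT CAST(CHARINDEX(@Open, @Body) AS bigint) AS Pos
    UNION ALL
    -- Recursive step: look for the next opening delimiter after the previous hit.
    SELECT CAST(CHARINDEX(@Open, @Body, Pos + 1) AS bigint)
    FROM Hits
    WHERE Pos > 0          -- stop recursing once CHARINDEX has returned 0
)
SELECT
    -- Text between the opening delimiter and the next ']' after it.
    SUBSTRING(@Body,
              Pos + LEN(@Open),
              CHARINDEX(']', @Body, Pos) - Pos - LEN(@Open)) AS Email
FROM Hits
WHERE Pos > 0              -- drop the terminating "not found" row
ORDER BY Pos;
-- Note: the default MAXRECURSION limit of 100 caps this at 100 addresses per value.
```

With the sample value above, this should return one row per recipient. Unlike the TVF, it handles a single string at a time, so it is better suited to ad hoc checks than to querying a whole table.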
EDIT - For Body
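The two delimiters are '[/url]' and '|||'. We force an ending delimiter by appending a unique string; in this case, I chose |||.
If you don't want multiple records, remove the CROSS APPLY B.
Example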
Select A.ID
,B.*
,Body = ltrim(rtrim(C.RetVal))
From @YourTable A
Cross Apply [dbo].[tvf-Str-Extract](A.Body,'[url=mailto:',']') B
Cross Apply [dbo].[tvf-Str-Extract](A.Body+'|||','[/url]','|||') C --- Notice A.Body+'|||'.... this is to force an ending delimiter
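To fix the problem with your query, you need to start searching for the ']' character after 'EmailCC'. Otherwise you pick up the first occurrence of ']', which comes before 'EmailCC', so the computed length is negative and you get the error. You can do this by passing a "start_location" argument to CHARINDEX().
So change your query to the following: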
Select
Body,
SUBSTRING(Body, CHARINDEX('EmailTO', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailTO',Body)-20) ToEmail,
SUBSTRING(Body, CHARINDEX('EmailCC', Body) + 20,CHARINDEX(']',Body, CHARINDEX('EmailCC', Body))-CHARINDEX('EmailCC',Body)-20) CCEmail
From hdIssues
See the documentation here: https://docs.microsoft.com/en-us/sql/t-sql/functions/charindex-transact-sql
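To see why the original expression fails, it can help to look at the computed lengths side by side. The following is a minimal sketch, assuming a table variable @hdIssues as a hypothetical stand-in for the hdIssues table and a CR/LF line break between the EmailTO and EmailCC lines.

```sql
-- Hypothetical stand-in for the hdIssues table from the question.
DECLARE @hdIssues TABLE (Body varchar(max));

INSERT INTO @hdIssues (Body)
VALUES ('EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]' + CHAR(13) + CHAR(10)
      + 'EmailCC:[url=mailto:Test_Email_2@Yahoo.com] Test_Email_2@Yahoo.com[/url]' + CHAR(13) + CHAR(10)
      + 'Hello, This is the rest of the email message....');

SELECT
    -- Length as computed by the original query: the first ']' lies before
    -- 'EmailCC', so the subtraction goes negative and SUBSTRING raises Msg 537.
    CHARINDEX(']', Body) - CHARINDEX('EmailCC', Body) - 20 AS OriginalLength,
    -- Length with start_location: the search for ']' begins at 'EmailCC'.
    CHARINDEX(']', Body, CHARINDEX('EmailCC', Body))
        - CHARINDEX('EmailCC', Body) - 20 AS FixedLength,
    -- The corrected extraction itself.
    SUBSTRING(Body,
              CHARINDEX('EmailCC', Body) + 20,
              CHARINDEX(']', Body, CHARINDEX('EmailCC', Body))
                  - CHARINDEX('EmailCC', Body) - 20) AS CCEmail
FROM @hdIssues;
```

For this sample row, OriginalLength comes out negative, FixedLength is 22 (the length of the address), and CCEmail is Test_Email_2@Yahoo.com. The constant 20 is the length of 'EmailCC:[url=mailto:', so it only holds while that prefix stays exactly as shown.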