SQL - Extracting text between characters

This is what my data looks like. (I am trying to extract the exact email addresses so that I can send emails to the To and CC recipients.)

    EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]             
    EmailCC:[url=mailto:Test_Email_2@Yahoo.com] Test_Email_2@Yahoo.com[/url]           

    Hello, This is the rest of the email message....

When I run the first SQL statement, I get the result I want.

    Select
    Body,
    SUBSTRING(Body, CHARINDEX('EmailTO', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailTO',Body)-20) ToEmail
    From hdIssues

This returns

    ToEmail = Test_Email_1@Yahoo.com

But when I try to add a second SUBSTRING like this

    Select
    Body,
    SUBSTRING(Body, CHARINDEX('EmailTO', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailTO',Body)-20) ToEmail,
    SUBSTRING(Body, CHARINDEX('EmailCC', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailCC',Body)-20) CCEmail --(Simply replacing the EmailTo from the previous line to EmailCC)
    From hdIssues   

I get this error

    "Msg 537, Level 16, State 5, Line 1 Invalid length parameter passed to the LEFT or SUBSTRING function."

Any help is appreciated.

P.S. In my dataset, the email address field can contain multiple recipients, separated by semicolons, like this:

[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]; [url=mailto:Test_Email_5@Yahoo.com] Test_Email_5@Yahoo.com[/url]; [url=mailto:Test_Email_8@Yahoo.com] Test_Email_8@Yahoo.com[/url]

I would use regexp_substr (the example below uses Oracle syntax, as the `dual` table shows):

    with t1(col) as(
       select 'EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]' from dual
    )

    select regexp_substr(col, '[[:alnum:]._%-]+@[[:alnum:]._%-]+\.com') as res
      from t1;

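I kept the pattern general because, as you say in your P.S., there may be multiple email addresses. Note that regexp_substr returns one match per call (the first by default), so further addresses need its occurrence argument. Each address also appears twice in the markup, so you can modify the regular expression to extract only one copy of each email.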
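Since the P.S. mentions several recipients, here is a minimal sketch of pulling every address rather than only the first. It assumes Oracle (as the `dual` table above suggests) in a version where `regexp_substr` accepts the occurrence and subexpression arguments and `regexp_count` is available. The sample string and the `mailto:` anchor pattern are my own assumptions, not part of the answer above.

```sql
-- Sample string is hypothetical; it mirrors the multi-recipient format from the question.
with t1(col) as (
   select 'EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]; '
       || '[url=mailto:Test_Email_5@Yahoo.com] Test_Email_5@Yahoo.com[/url]' from dual
)
select level as occurrence,
       -- 4th argument: which match to return; 6th argument: return only capture group 1
       regexp_substr(col, 'mailto:([^]]+)', 1, level, null, 1) as res
  from t1
connect by level <= regexp_count(col, 'mailto:');
```

Anchoring on `mailto:` captures each address once, from the tag rather than from the display text. The `connect by level` row generator is only safe in this form for a single-row source; against a real table the rows would have to be correlated.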

If you are open to a TVF (table-valued function)

Example

    Select A.ID
          ,B.*
     From  YourTable A
     Cross Apply [dbo].[tvf-Str-Extract](A.Body,'[url=mailto:',']') B

Returns

    ID  RetSeq  RetPos  RetVal
    1   1       23      Test_Email_1@Yahoo.com
    1   2       89      Test_Email_5@Yahoo.com
    1   3       155     Test_Email_8@Yahoo.com
    1   4       229     Test_Email_2@Yahoo.com

The TVF, if you are interested

    CREATE FUNCTION [dbo].[tvf-Str-Extract] (@String varchar(max),@Delimiter1 varchar(100),@Delimiter2 varchar(100))
    Returns Table 
    As
    Return (  

    with   cte1(N)   As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
           cte2(N)   As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 N1,cte1 N2,cte1 N3,cte1 N4,cte1 N5,cte1 N6) A ),
           cte3(N)   As (Select 1 Union All Select t.N+DataLength(@Delimiter1) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter1)) = @Delimiter1),
           cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter1,@String,s.N),0)-S.N,8000) From cte3 S)

    Select RetSeq = Row_Number() over (Order By N)
          ,RetPos = N
          ,RetVal = left(RetVal,charindex(@Delimiter2,RetVal)-1) 
     From  (
            Select *,RetVal = Substring(@String, N, L) 
             From  cte4
           ) A
     Where charindex(@Delimiter2,RetVal)>1

    )
    /*
    Max Length of String 1MM characters

    Declare @String varchar(max) = 'Dear [[FirstName]] [[LastName]], ...'
    Select * From [dbo].[tvf-Str-Extract] (@String,'[[',']]')
    */
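To try the function against data shaped like the question's, a minimal sketch follows. It assumes the TVF above has already been created; the table variable `@YourTable` and its single sample row are hypothetical stand-ins for the real `hdIssues` table.

```sql
-- Hypothetical sample data; the real source table in the question is hdIssues.
Declare @YourTable table (ID int, Body varchar(max))
Insert Into @YourTable values
 (1, 'EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]; [url=mailto:Test_Email_5@Yahoo.com] Test_Email_5@Yahoo.com[/url]
EmailCC:[url=mailto:Test_Email_2@Yahoo.com] Test_Email_2@Yahoo.com[/url]
Hello, This is the rest of the email message....')

-- One row per address found between '[url=mailto:' and the next ']'
Select A.ID
      ,B.RetSeq
      ,B.RetVal
 From  @YourTable A
 Cross Apply [dbo].[tvf-Str-Extract](A.Body,'[url=mailto:',']') B
```

Note that this returns To and CC addresses in one list; the function itself does not tell them apart, so separating them would need an extra step such as comparing `RetPos` with the position of `EmailCC` in the body.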

EDIT - For Body

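The two delimiters are '[/url]' and '|||'. We force an ending delimiter by appending a unique string to the body; in this case, I chose |||.

If you don't want multiple records, remove the CROSS APPLY B.

Example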

    Select A.ID
          ,B.*
          ,Body = ltrim(rtrim(C.RetVal))
     From  @YourTable A
     Cross Apply [dbo].[tvf-Str-Extract](A.Body,'[url=mailto:',']') B
     Cross Apply [dbo].[tvf-Str-Extract](A.Body+'|||','[/url]','|||') C  --- Notice A.Body+'|||'.... this is to force an ending delimiter

Returns

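To fix your query, you need to start the search for the ']' character at the position of 'EmailCC'. Otherwise you pick up the first ']' in the body, which occurs before 'EmailCC', so the computed length is negative and you get the error. You can do this by passing a "start_location" argument to CHARINDEX().

So change your query to the following: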

    Select
    Body,
    SUBSTRING(Body, CHARINDEX('EmailTO', Body) + 20,CHARINDEX(']',Body)-CHARINDEX('EmailTO',Body)-20) ToEmail,
    SUBSTRING(Body, CHARINDEX('EmailCC', Body) + 20,CHARINDEX(']',Body, CHARINDEX('EmailCC', Body))-CHARINDEX('EmailCC',Body)-20) CCEmail
    From hdIssues
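To see why the original expression fails and the corrected one works, here is a small sketch that can be run on its own. The `@Body` variable is hypothetical sample data built from the question's example; it is not the real `hdIssues` table.

```sql
-- Hypothetical sample body, built from the example in the question.
DECLARE @Body varchar(max) =
    'EmailTO:[url=mailto:Test_Email_1@Yahoo.com] Test_Email_1@Yahoo.com[/url]' + CHAR(13) + CHAR(10) +
    'EmailCC:[url=mailto:Test_Email_2@Yahoo.com] Test_Email_2@Yahoo.com[/url]';

DECLARE @cc int = CHARINDEX('EmailCC', @Body);   -- where the CC line starts

SELECT
    CHARINDEX(']', @Body)                 AS FirstBracket,    -- the ']' that closes the EmailTO tag
    @cc                                   AS CCStart,
    CHARINDEX(']', @Body) - @cc - 20      AS OriginalLength,  -- negative, which is what raises Msg 537
    CHARINDEX(']', @Body, @cc) - @cc - 20 AS FixedLength,     -- search for ']' starts at EmailCC
    SUBSTRING(@Body, @cc + 20,
              CHARINDEX(']', @Body, @cc) - @cc - 20) AS CCEmail;  -- Test_Email_2@Yahoo.com
```

The offset of 20 is the length of the prefix `EmailCC:[url=mailto:`, so this still depends on that exact layout. Rows whose body has no `EmailCC` at all make `CHARINDEX` return 0 and may still produce an invalid length, so a guard such as a `CASE` on `CHARINDEX('EmailCC', Body) > 0` is worth considering. With multiple recipients on one line, this expression returns only the first address.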

See the documentation here: https://docs.microsoft.com/en-us/sql/t-sql/functions/charindex-transact-sql