Search XML, Base64, UTF-8 string in MS SQL?

I have a column containing a DataContract XML string, as shown below:

<ReportHandler.ReportWrapper
    xmlns:i="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="http://schemas.datacontract.org/2004/07/MyApp.Client.Main.GUI.Report">
    <ClassInfoList
        xmlns:d2p1="http://schemas.datacontract.org/2004/07/MyApp.Service.External.ServiceContracts.Internal.Report" />
    <Report>77u/PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0idXRmLTgiPz4KPFh0cmFSZXBvcnRzTGF5b3V0U2VyaWFsaXplciBTZXJpYWxpemVyVmVyc2lvbj0iMTkuMi43LjAiIFJlZj0iMCIgQ29udHJvbFR5cGU9Ik15QXBwLkNsaWVudC5NYWluLkdVSS5SZXBvcnQucmVwUmVwb3J0LCBNeUFwcCwgVmVyc2lvbj01LjEzLjAuMjA4NTcsIEN1bHR1cmU9bmV1dHJhbCwgUHVibGljS2V5VG9rZW49bnVsbCIgTmFtZT0icmVwUmVwb3J0IiBNYXJnaW5zPSIxMDAsIDgwLCAyMCwgMTAwIiBQYXBlcktpbmQ9IkE0IiBQYWdlV2lkdGg9IjgyNyIgUGFnZUhlaWdodD0iMTE2OSIgVmVyc2lvbj0iMTkuMiIgUmVxdWVzdFBhcmFtZXRlcnM9ImZhbHNlIiBFdmVudHNJbmZvPSJyZXBSZXBvcnQsQmVmb3JlUHJpbnQscmVwUmVwb3J0X0JlZm9yZVByaW50O3JlcFJlcG9ydCxEZXNpZ25lckxvYWRlZCxyZXBSZXBvcnRfRGVzaWduZXJMb2FkZWQiPgogIDxCYW5kcz4KICAgIDxJdGVtMSBSZWY9IjEiIENvbnRyb2xUeXBlPSJEZXRhaWxCYW5kIiBOYW1lPSJEZXRhaWwiIEhlaWdodEY9IjIwNi42MjQ5ODUiIFRleHRBbGlnbm1lbnQ9IlRvcExlZnQiIFBhZGRpbmc9IjAsMCwwLDAsMTAwIiAvPgogICAgPEl0ZW0yIFJlZj0iMiIgQ29udHJvbFR5cGU9IlBhZ2VIZWFkZXJCYW5kIiBOYW1lPSJQYWdlSGVhZGVyIiBIZWlnaHRGPSIzMCIgVGV4dEFsaWdubWVudD0iVG9wTGVmdCIgUGFkZGluZz0iMCwwLDAsMCwxMDAiIC8+CiAgICA8SXRlbTMgUmVmPSIzIiBDb250cm9sVHlwZT0iUGFnZUZvb3RlckJhbmQiIE5hbWU9IlBhZ2VGb290ZXIiIEhlaWdodEY9IjMwIiBUZXh0QWxpZ25tZW50PSJUb3BMZWZ0IiBQYWRkaW5nPSIwLDAsMCwwLDEwMCIgLz4KICAgIDxJdGVtNCBSZWY9IjQiIENvbnRyb2xUeXBlPSJUb3BNYXJnaW5CYW5kIiBOYW1lPSJ0b3BNYXJnaW5CYW5kMSIgSGVpZ2h0Rj0iMjAiIC8+CiAgICA8SXRlbTUgUmVmPSI1IiBDb250cm9sVHlwZT0iQm90dG9tTWFyZ2luQmFuZCIgTmFtZT0iYm90dG9tTWFyZ2luQmFuZDEiIC8+CiAgPC9CYW5kcz4KPC9YdHJhUmVwb3J0c0xheW91dFNlcmlhbGl6ZXI+</Report>
</ReportHandler.ReportWrapper>

Now I need to extract and decode the inner Base64 UTF-8 string in the <Report> tag so that I can search it for specific data.

The best solution seems to be a SQL stored procedure with a cursor that runs through every row, uses CHARINDEX() and SUBSTRING() to extract the Base64 string, and then decodes the inner Base64 UTF-8 string.

So something like this:

DECLARE @id INT;
-- dbo.ReportTable is a placeholder; the real table name is not given here
DECLARE db_cursor CURSOR FOR SELECT Id FROM dbo.ReportTable;
OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @id;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Find the start index with CHARINDEX('<Report>', Column) and the end index with CHARINDEX('</Report>', Column)
    -- Use SUBSTRING on the column with the start and end index found to get the Base64 string
    -- Convert the [Base64 UTF-8 to searchable string Select Convert Cast xs:base64Binary][1]; maybe we need to convert to UTF-8 as well in some way
    -- Do a select against the result string to see if it contains specific data
    -- If it does contain specific data, log it to another table with a simple insert
    FETCH NEXT FROM db_cursor INTO @id;  -- advance to the next row, otherwise the loop never ends
END
CLOSE db_cursor;
DEALLOCATE db_cursor;

Is there a better way to do this?

You have an XML document embedded in another document, encoded from UTF-8 to Base64.

To convert it back, follow these steps:

  • Select the value using XQuery: (/ReportHandler.ReportWrapper/Report/text())[1]
  • Add the namespace using WITH XMLNAMESPACES (DEFAULT 'http://schemas.datacontract.org/2004/07/MyApp.Client.Main.GUI.Report')
  • Convert from Base64 to varbinary using xs:base64Binary()
  • Convert that to xml using CONVERT

WITH XMLNAMESPACES (DEFAULT 'http://schemas.datacontract.org/2004/07/MyApp.Client.Main.GUI.Report')
SELECT CONVERT(xml,
    @xml.value(
    'xs:base64Binary((/ReportHandler.ReportWrapper/Report/text())[1])',
    'varbinary(max)')
)

Replace @xml with your column name when you run this as a SELECT against your table.
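As a rough sketch of what that could look like against a table, the query below decodes every row in one set-based SELECT and filters on the decoded text, with no cursor. The table variable @Reports, its columns Id and Data, the two tiny sample payloads, and the search term 'Root' are all assumptions made up for illustration; they do not come from the question.

```sql
-- Hypothetical table; the real table and column names are not given in the question.
DECLARE @Reports TABLE (Id int PRIMARY KEY, Data xml NOT NULL);

-- Sample rows with tiny payloads:
--   PFIgTj0iQSIvPg==  is Base64 for  <R N="A"/>
--   PFJvb3QgLz4=      is Base64 for  <Root />
INSERT INTO @Reports (Id, Data) VALUES
(1, N'<ReportHandler.ReportWrapper xmlns="http://schemas.datacontract.org/2004/07/MyApp.Client.Main.GUI.Report"><Report>PFIgTj0iQSIvPg==</Report></ReportHandler.ReportWrapper>'),
(2, N'<ReportHandler.ReportWrapper xmlns="http://schemas.datacontract.org/2004/07/MyApp.Client.Main.GUI.Report"><Report>PFJvb3QgLz4=</Report></ReportHandler.ReportWrapper>');

WITH XMLNAMESPACES (DEFAULT 'http://schemas.datacontract.org/2004/07/MyApp.Client.Main.GUI.Report')
SELECT r.Id, d.Layout
FROM @Reports AS r
CROSS APPLY (
    -- Decode once per row so the result can be reused in SELECT and WHERE
    SELECT CONVERT(xml, r.Data.value(
        'xs:base64Binary((/ReportHandler.ReportWrapper/Report/text())[1])',
        'varbinary(max)')) AS Layout
) AS d
-- Plain substring search on the decoded document; 'Root' is a placeholder term
WHERE CONVERT(nvarchar(max), d.Layout) LIKE N'%Root%';
```

With this sample data only the row with Id = 2 should come back. The CROSS APPLY is just a convenience so the decode expression is written once; a query of this shape could also be used as the source of an INSERT ... SELECT if the matches need to be logged to another table.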

Result:

<XtraReportsLayoutSerializer SerializerVersion="19.2.7.0" Ref="0" ControlType="MyApp.Client.Main.GUI.Report.repReport, MyApp, Version=5.13.0.20857, Culture=neutral, PublicKeyToken=null" Name="repReport" Margins="100, 80, 20, 100" PaperKind="A4" PageWidth="827" PageHeight="1169" Version="19.2" RequestParameters="false" EventsInfo="repReport,BeforePrint,repReport_BeforePrint;repReport,DesignerLoaded,repReport_DesignerLoaded"><Bands><Item1 Ref="1" ControlType="DetailBand" Name="Detail" HeightF="206.624985" TextAlignment="TopLeft" Padding="0,0,0,0,100" /><Item2 Ref="2" ControlType="PageHeaderBand" Name="PageHeader" HeightF="30" TextAlignment="TopLeft" Padding="0,0,0,0,100" /><Item3 Ref="3" ControlType="PageFooterBand" Name="PageFooter" HeightF="30" TextAlignment="TopLeft" Padding="0,0,0,0,100" /><Item4 Ref="4" ControlType="TopMarginBand" Name="topMarginBand1" HeightF="20" /><Item5 Ref="5" ControlType="BottomMarginBand" Name="bottomMarginBand1" /></Bands></XtraReportsLayoutSerializer>

If needed, you can also use XQuery on that result.
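Here is a minimal sketch of that, again using a made-up table variable @Reports and made-up sample payloads. One thing to watch for: WITH XMLNAMESPACES applies to every XQuery expression in the statement, and the decoded inner document has no namespace, so this sketch binds the wrapper namespace to a prefix (w, an arbitrary choice) instead of making it the default.

```sql
-- Hypothetical table; the real table and column names are not given in the question.
DECLARE @Reports TABLE (Id int PRIMARY KEY, Data xml NOT NULL);

-- Sample rows with tiny payloads:
--   PFIgTj0iQSIvPg==  is Base64 for  <R N="A"/>
--   PFJvb3QgLz4=      is Base64 for  <Root />
INSERT INTO @Reports (Id, Data) VALUES
(1, N'<ReportHandler.ReportWrapper xmlns="http://schemas.datacontract.org/2004/07/MyApp.Client.Main.GUI.Report"><Report>PFIgTj0iQSIvPg==</Report></ReportHandler.ReportWrapper>'),
(2, N'<ReportHandler.ReportWrapper xmlns="http://schemas.datacontract.org/2004/07/MyApp.Client.Main.GUI.Report"><Report>PFJvb3QgLz4=</Report></ReportHandler.ReportWrapper>');

-- Prefix for the outer wrapper only, so unprefixed paths match the inner document
WITH XMLNAMESPACES ('http://schemas.datacontract.org/2004/07/MyApp.Client.Main.GUI.Report' AS w)
SELECT r.Id,
       d.Layout.value('(/R/@N)[1]', 'nvarchar(100)') AS N   -- read an attribute from the decoded XML
FROM @Reports AS r
CROSS APPLY (
    SELECT CONVERT(xml, r.Data.value(
        'xs:base64Binary((/w:ReportHandler.ReportWrapper/w:Report/text())[1])',
        'varbinary(max)')) AS Layout
) AS d
-- Keep only rows whose decoded document has an <R> element with N="A"
WHERE d.Layout.exist('/R[@N="A"]') = 1;
```

With this sample data the query should return the row with Id = 1 and N = 'A'. For the real reports, the inner paths would target the layout elements instead, for example something along the lines of /XtraReportsLayoutSerializer/Bands/*[@ControlType="PageHeaderBand"].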
