How to iterate through every potential XML subelement using SQL Server

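I have a large XML file containing more than 45K contacts, and I need to iterate over each contact's Transaction subelements and load them into a SQL table. I have looked at several solutions that use value(), nodes(), and so on, but none of the examples has a structure close to my XML: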

<Contacts>
    <Contact>
        <ContactID>1234</ContactID>
        <ContactName>’John Doe’</ContactName>
        <DOB>09031978</DOB>
        <Address>’123 Main Street’</Address>
            <Transactions>
                <Transaction>
                <TransactionID>4490</TransactionID>
                <ProductName>’Recliner’</ProductName>
                <Cost>123.00</Cost>
                <PurchaseDate>07042020
                </Transaction>
                <Transaction>
                <TransactionID>5678</TransactionID>
                <ProductName>’Lamp’</ProductName>
                <Cost>45.00</Cost>
                <PurchaseDate>07042020
                <Transaction>
            </Transactions>
    </Contact>
    <Contact>
        <ContactID>4567</ContactID>
        <ContactName>’Jane Doe’</ContactName>
        <DOB>05191984</DOB>
        <Address>’567 Fake Street’</Address>
            <Transactions>
                <Transaction>
                <TransactionID>4378</TransactionID>
                <ProductName>’Coffee Table’</ProductName>
                <Cost>225.00</Cost>
                <PurchaseDate>07042018
                </Transaction>
            </Transactions>
    </Contact>
</Contacts>

I need the data in the following form:

ContactID  TransactionID  ProductName   Cost    PurchaseDate
1234       4490           Recliner      123.00  4 July 2020
1234       5678           Lamp          45.00   4 July 2020
4567       4378           Coffee Table  225.00  4 July 2018

I tried querying it with the following script:

EXEC sp_xml_preparedocument @idoc OUTPUT, @doc  
-- Execute a SELECT stmt using OPENXML rowset provider.  
SELECT *  
FROM OPENXML (@idoc, '/Contacts/Contact/Transactions/Transaction',2)  
WITH (ContactID     int             '../ContactID',  
      TransactionID  int            'TransactionID',  
      ProductName   nvarchar(50)    'ProductName',  
      Cost          float           'Cost',  
      PurchaseDate  date            'PurchaseDate')

But this either returns ContactID as NULL, or returns only one transaction per ContactID. I need it to iterate and return every transaction that each contact has.

Any insight is welcome!

Try to avoid sp_xml_preparedocument, because it allocates a large amount of memory that SQL Server cannot reuse until you remember to free it by calling sp_xml_removedocument.
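If OPENXML has to stay, the NULL ContactID in the question most likely comes from the column path: the parent of Transaction is Transactions, not Contact, so '../ContactID' points at an element that does not exist. Below is a minimal sketch, assuming a trimmed-down copy of the sample XML and reading PurchaseDate as text (the MMDDYYYY string is not something I would expect OPENXML to convert to date on its own). It climbs two levels to reach ContactID and releases the document handle at the end:

```sql
-- Trimmed-down sample document (assumption: same shape as the XML in the question)
DECLARE @doc nvarchar(max) = N'<Contacts>
  <Contact>
    <ContactID>1234</ContactID>
    <Transactions>
      <Transaction>
        <TransactionID>4490</TransactionID>
        <ProductName>Recliner</ProductName>
        <Cost>123.00</Cost>
        <PurchaseDate>07042020</PurchaseDate>
      </Transaction>
      <Transaction>
        <TransactionID>5678</TransactionID>
        <ProductName>Lamp</ProductName>
        <Cost>45.00</Cost>
        <PurchaseDate>07042020</PurchaseDate>
      </Transaction>
    </Transactions>
  </Contact>
</Contacts>';

DECLARE @idoc int;

-- Parse the document; this allocates memory that stays reserved until it is released
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc;

SELECT *
FROM OPENXML(@idoc, '/Contacts/Contact/Transactions/Transaction', 2)
WITH (ContactID      int            '../../ContactID',  -- two levels up: Transactions, then Contact
      TransactionID  int            'TransactionID',
      ProductName    nvarchar(50)   'ProductName',
      Cost           decimal(10, 2) 'Cost',
      PurchaseDate   nvarchar(8)    'PurchaseDate');    -- kept as text; convert separately

-- Always release the handle, otherwise the memory is not returned
EXEC sp_xml_removedocument @idoc;
```

This should return one row per Transaction with its contact's ID filled in, but it still carries the memory cost described above.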

What you are asking for is easy to achieve with nodes() and cross apply, for example (after fixing your XML sample):

declare @doc xml = N'<Contacts>
    <Contact>
        <ContactID>1234</ContactID>
        <ContactName>’John Doe’</ContactName>
        <DOB>09031978</DOB>
        <Address>’123 Main Street’</Address>
            <Transactions>
                <Transaction>
                <TransactionID>4490</TransactionID>
                <ProductName>’Recliner’</ProductName>
                <Cost>123.00</Cost>
                <PurchaseDate>07042020</PurchaseDate>
                </Transaction>
                <Transaction>
                <TransactionID>5678</TransactionID>
                <ProductName>’Lamp’</ProductName>
                <Cost>45.00</Cost>
                <PurchaseDate>07042020</PurchaseDate>
                </Transaction>
            </Transactions>
    </Contact>
    <Contact>
        <ContactID>4567</ContactID>
        <ContactName>’Jane Doe’</ContactName>
        <DOB>05191984</DOB>
        <Address>’567 Fake Street’</Address>
            <Transactions>
                <Transaction>
                <TransactionID>4378</TransactionID>
                <ProductName>’Coffee Table’</ProductName>
                <Cost>225.00</Cost>
                <PurchaseDate>07042018</PurchaseDate>
                </Transaction>
            </Transactions>
    </Contact>
</Contacts>';

select
  Cont.value('ContactID[1]', 'int') as ContactID
  ,Trans.value('TransactionID[1]', 'int') as TransactionID
  ,Trans.value('ProductName[1]', 'nvarchar(50)') as ProductName
  ,Trans.value('Cost[1]', 'float') as Cost
  ,convert(date, concat(substring(purDate,1,2), '/', substring(purDate,3,2), '/', substring(purDate,5,4)), 101) as PurchaseDate
from @doc.nodes('//Contact') nodes1(Cont)
cross apply nodes1.Cont.nodes('Transactions/Transaction') nodes2(Trans)
outer apply (
  select purDate = Trans.value('PurchaseDate[1]', 'nvarchar(8)')
) temp;

Which yields:

ContactID  TransactionID  ProductName     Cost  PurchaseDate
1234       4490           ’Recliner’      123   2020-07-04
1234       5678           ’Lamp’          45    2020-07-04
4567       4378           ’Coffee Table’  225   2018-07-04
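Since the question asks for the rows to end up in a table, the same nodes() pattern can feed an INSERT directly. The following is only a sketch under stated assumptions: the temp table #ContactTransactions and its column types are hypothetical, the XML is a shortened version of the sample, and try_convert requires SQL Server 2012 or later. It uses stuff() to turn MMDDYYYY into MM/DD/YYYY before converting with style 101.

```sql
-- Hypothetical target table; adjust names and types to the real schema
create table #ContactTransactions (
  ContactID      int            not null,
  TransactionID  int            not null,
  ProductName    nvarchar(50)   null,
  Cost           decimal(10, 2) null,
  PurchaseDate   date           null
);

-- Shortened sample document (assumption: same shape as the XML above)
declare @doc xml = N'<Contacts>
  <Contact>
    <ContactID>1234</ContactID>
    <Transactions>
      <Transaction>
        <TransactionID>4490</TransactionID>
        <ProductName>Recliner</ProductName>
        <Cost>123.00</Cost>
        <PurchaseDate>07042020</PurchaseDate>
      </Transaction>
    </Transactions>
  </Contact>
</Contacts>';

insert into #ContactTransactions (ContactID, TransactionID, ProductName, Cost, PurchaseDate)
select
  nodes1.Cont.value('(ContactID/text())[1]', 'int')
  ,nodes2.Trans.value('(TransactionID/text())[1]', 'int')
  ,nodes2.Trans.value('(ProductName/text())[1]', 'nvarchar(50)')
  ,nodes2.Trans.value('(Cost/text())[1]', 'decimal(10, 2)')
  -- '07042020' -> '07/04/2020'; try_convert gives NULL instead of an error on bad input
  ,try_convert(date, stuff(stuff(d.purDate, 5, 0, '/'), 3, 0, '/'), 101)
from @doc.nodes('/Contacts/Contact') nodes1(Cont)            -- one row per Contact
cross apply nodes1.Cont.nodes('Transactions/Transaction') nodes2(Trans)  -- one row per Transaction
cross apply (
  select purDate = nodes2.Trans.value('(PurchaseDate/text())[1]', 'char(8)')
) d;

select * from #ContactTransactions;
```

The explicit path '/Contacts/Contact' is used here instead of '//Contact' on the assumption that Contact elements only appear at that level; it avoids a search of the whole document, which may matter with 45K contacts.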