How to iterate through every potential XML subelement using SQL Server
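I have a large XML file with over 45K contacts, and I need to iterate over each contact's child Transaction elements and load them into a SQL table. I have looked at several solutions using value(), nodes(), etc., but none of the examples come close to my XML structure: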
<Contacts>
<Contact>
<ContactID>1234</ContactID>
<ContactName>’John Doe’</ContactName>
<DOB>09031978</DOB>
<Address>’123 Main Street’</Address>
<Transactions>
<Transaction>
<TransactionID>4490</TransactionID>
<ProductName>’Recliner’</ProductName>
<Cost>123.00</Cost>
<PurchaseDate>07042020
</Transaction>
<Transaction>
<TransactionID>5678</TransactionID>
<ProductName>’Lamp’</ProductName>
<Cost>45.00</Cost>
<PurchaseDate>07042020
<Transaction>
</Transactions>
</Contact>
<Contact>
<ContactID>4567</ContactID>
<ContactName>’Jane Doe’</ContactName>
<DOB>05191984</DOB>
<Address>’567 Fake Street’</Address>
<Transactions>
<Transaction>
<TransactionID>4378</TransactionID>
<ProductName>’Coffee Table’</ProductName>
<Cost>225.00</Cost>
<PurchaseDate>07042018
</Transaction>
</Transactions>
</Contact>
</Contacts>
I need the data to look like this:
ContactID | TransactionID | ProductName | Cost | PurchaseDate |
---|---|---|---|---|
1234 | 4490 | Recliner | 123.00 | 4 July 2020 |
1234 | 5678 | Lamp | 45.00 | 4 July 2020 |
4567 | 4378 | Coffee Table | 225.00 | 4 July 2018 |
I tried querying it with the following script:
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
-- Execute a SELECT stmt using OPENXML rowset provider.
SELECT *
FROM OPENXML (@idoc, '/Contacts/Contact/Transactions/Transaction',2)
WITH (ContactID int '../ContactID',
TransactionID int 'TransactionID',
ProductName nvarchar(50) 'ProductName',
Cost float 'Cost',
PurchaseDate date 'PurchaseDate')
But this either returns ContactID as null, or returns only one transaction per ContactID. I need it to iterate and return every transaction each contact has.
Any insight is welcome!
Try to avoid sp_xml_preparedocument, because it uses large amounts of memory that SQL Server can't use until you remember to free it by invoking sp_xml_removedocument.
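If OPENXML has to stay, the null ContactID in the question appears to come from the column pattern: the row pattern stops at Transaction, so '../ContactID' looks inside Transactions, while ContactID sits one level higher, under Contact. The following is a minimal sketch on a trimmed-down document (the shortened XML is an assumption made for brevity, not the asker's full file); it reads PurchaseDate as text, because an MMDDYYYY string would not convert to date as-is.

```sql
DECLARE @idoc int;
DECLARE @doc nvarchar(max) = N'<Contacts>
  <Contact>
    <ContactID>1234</ContactID>
    <Transactions>
      <Transaction><TransactionID>4490</TransactionID><PurchaseDate>07042020</PurchaseDate></Transaction>
      <Transaction><TransactionID>5678</TransactionID><PurchaseDate>07042020</PurchaseDate></Transaction>
    </Transactions>
  </Contact>
</Contacts>';

-- Parse the document and obtain a handle to it.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc;

SELECT *
FROM OPENXML(@idoc, '/Contacts/Contact/Transactions/Transaction', 2)
WITH (
    ContactID     int     '../../ContactID',  -- two levels up: Transaction -> Transactions -> Contact
    TransactionID int     'TransactionID',
    PurchaseDate  char(8) 'PurchaseDate'      -- kept as text; convert it in a later step
);

-- Always release the parsed document, otherwise its memory stays allocated.
EXEC sp_xml_removedocument @idoc;
```

Pairing every sp_xml_preparedocument call with sp_xml_removedocument is what keeps the memory concern above in check.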
What you're asking for is easy to achieve with nodes() and cross apply, e.g. (after fixing your XML sample):
declare @doc xml = N'<Contacts>
<Contact>
<ContactID>1234</ContactID>
<ContactName>’John Doe’</ContactName>
<DOB>09031978</DOB>
<Address>’123 Main Street’</Address>
<Transactions>
<Transaction>
<TransactionID>4490</TransactionID>
<ProductName>’Recliner’</ProductName>
<Cost>123.00</Cost>
<PurchaseDate>07042020</PurchaseDate>
</Transaction>
<Transaction>
<TransactionID>5678</TransactionID>
<ProductName>’Lamp’</ProductName>
<Cost>45.00</Cost>
<PurchaseDate>07042020</PurchaseDate>
</Transaction>
</Transactions>
</Contact>
<Contact>
<ContactID>4567</ContactID>
<ContactName>’Jane Doe’</ContactName>
<DOB>05191984</DOB>
<Address>’567 Fake Street’</Address>
<Transactions>
<Transaction>
<TransactionID>4378</TransactionID>
<ProductName>’Coffee Table’</ProductName>
<Cost>225.00</Cost>
<PurchaseDate>07042018</PurchaseDate>
</Transaction>
</Transactions>
</Contact>
</Contacts>';
select
     Cont.value('ContactID[1]', 'int') as ContactID
    ,Trans.value('TransactionID[1]', 'int') as TransactionID
    ,Trans.value('ProductName[1]', 'nvarchar(50)') as ProductName
    ,Trans.value('Cost[1]', 'float') as Cost
    -- PurchaseDate arrives as MMDDYYYY; rebuild it as mm/dd/yyyy and convert with style 101
    ,convert(date, concat(substring(purDate,1,2), '/', substring(purDate,3,2), '/', substring(purDate,5,4)), 101) as PurchaseDate
from @doc.nodes('//Contact') nodes1(Cont)                                -- one row per Contact
cross apply nodes1.Cont.nodes('Transactions/Transaction') nodes2(Trans)  -- one row per Transaction of that Contact
outer apply (
    -- read the raw date string once so the select list can reuse it
    select purDate = Trans.value('PurchaseDate[1]', 'nvarchar(8)')
) temp;
Which yields:
ContactID | TransactionID | ProductName | Cost | PurchaseDate |
---|---|---|---|---|
1234 | 4490 | ’Recliner’ | 123 | 2020-07-04 |
1234 | 5678 | ’Lamp’ | 45 | 2020-07-04 |
4567 | 4378 | ’Coffee Table’ | 225 | 2018-07-04 |
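The result above still carries the typographic quotes around ProductName, and float drops the trailing zeros the question asked for in Cost. One possible variant, sketched here on a single-transaction document, strips the quotes, reads Cost as decimal(10,2), and inserts the rows into a table. The table variable @ContactTransactions and its column sizes are hypothetical stand-ins for the real target table, and REPLACE removes every ’ in the name, which may not be wanted for names that legitimately contain one.

```sql
DECLARE @doc xml = N'<Contacts>
  <Contact>
    <ContactID>1234</ContactID>
    <Transactions>
      <Transaction>
        <TransactionID>4490</TransactionID>
        <ProductName>’Recliner’</ProductName>
        <Cost>123.00</Cost>
        <PurchaseDate>07042020</PurchaseDate>
      </Transaction>
    </Transactions>
  </Contact>
</Contacts>';

-- Hypothetical target table; a table variable keeps the sketch self-contained.
DECLARE @ContactTransactions TABLE (
    ContactID     int,
    TransactionID int,
    ProductName   nvarchar(50),
    Cost          decimal(10, 2),
    PurchaseDate  date
);

INSERT INTO @ContactTransactions (ContactID, TransactionID, ProductName, Cost, PurchaseDate)
SELECT
    Cont.value('(ContactID/text())[1]', 'int'),
    Trans.value('(TransactionID/text())[1]', 'int'),
    -- remove the typographic quotes that wrap the product name
    REPLACE(Trans.value('(ProductName/text())[1]', 'nvarchar(50)'), N'’', N''),
    Trans.value('(Cost/text())[1]', 'decimal(10,2)'),
    -- MMDDYYYY -> mm/dd/yyyy, then convert with style 101
    CONVERT(date, STUFF(STUFF(p.purDate, 3, 0, '/'), 6, 0, '/'), 101)
FROM @doc.nodes('/Contacts/Contact') AS nodes1(Cont)
CROSS APPLY nodes1.Cont.nodes('Transactions/Transaction') AS nodes2(Trans)
CROSS APPLY (
    SELECT Trans.value('(PurchaseDate/text())[1]', 'char(8)') AS purDate
) AS p;

-- FORMAT (SQL Server 2012 and later) renders the date in the long form shown in the question.
SELECT ContactID, TransactionID, ProductName, Cost,
       FORMAT(PurchaseDate, 'd MMMM yyyy', 'en-US') AS PurchaseDate
FROM @ContactTransactions;
```

Using the explicit path '/Contacts/Contact' instead of '//Contact' avoids searching the whole document for Contact elements, which may matter with 45K contacts.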