从执行计划中粉碎 XML
Shredding XML From Execution Plans
首先我会说我讨厌 XML,可怕的东西,但有时是必要的。
我当前的问题是我正在尝试从执行计划(由用户提供,因此可以是任何大小)中获取 XML 并将其分解为 table 以供进一步使用操纵。我现在只有两个选择;
- 我可以计算出可用于执行计划的最大节点数量(我怀疑这会很多)并创建可用于任何 XML 输入的整个脚本。这将是一次性的事情,所以不是问题。
- 另一种方法是动态计算节点数并根据要求创建输出。
过去有人做过类似的练习吗?我发现的所有示例查询都已经知道输出字段。
用于将 XML 分解为元素和属性及其路径和父元素的脚本。来自 http://beyondrelational.com/modules/2/blogs/28/posts/10495/xquery-lab-58-select-from-xml.aspx
CREATE FUNCTION [dbo].[XMLTable](
@x XML
)
RETURNS TABLE
AS RETURN
/*----------------------------------------------------------------------
This INLINE TVF uses a recursive CTE that processes each element and
attribute of the XML document passed in.
----------------------------------------------------------------------*/
WITH cte AS (
/*------------------------------------------------------------------
Anchor part of the recursive query. Retrieves the root element
of the XML document
------------------------------------------------------------------*/
SELECT
1 AS lvl,
x.value('local-name(.)','NVARCHAR(MAX)') AS Name,
CAST(NULL AS NVARCHAR(MAX)) AS ParentName,
CAST(1 AS INT) AS ParentPosition,
CAST(N'Element' AS NVARCHAR(20)) AS NodeType,
x.value('local-name(.)','NVARCHAR(MAX)') AS FullPath,
x.value('local-name(.)','NVARCHAR(MAX)')
+ N'['
+ CAST(ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS NVARCHAR)
+ N']' AS XPath,
ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS Position,
x.value('local-name(.)','NVARCHAR(MAX)') AS Tree,
x.value('text()[1]','NVARCHAR(MAX)') AS Value,
x.query('.') AS this,
x.query('*') AS t,
CAST(CAST(1 AS VARBINARY(4)) AS VARBINARY(MAX)) AS Sort,
CAST(1 AS INT) AS ID
FROM @x.nodes('/*') a(x)
UNION ALL
/*------------------------------------------------------------------
Start recursion. Retrieve each child element of the parent node
------------------------------------------------------------------*/
SELECT
p.lvl + 1 AS lvl,
c.value('local-name(.)','NVARCHAR(MAX)') AS Name,
CAST(p.Name AS NVARCHAR(MAX)) AS ParentName,
CAST(p.Position AS INT) AS ParentPosition,
CAST(N'Element' AS NVARCHAR(20)) AS NodeType,
CAST(
p.FullPath
+ N'/'
+ c.value('local-name(.)','NVARCHAR(MAX)') AS NVARCHAR(MAX)
) AS FullPath,
CAST(
p.XPath
+ N'/'
+ c.value('local-name(.)','NVARCHAR(MAX)')
+ N'['
+ CAST(ROW_NUMBER() OVER(
PARTITION BY c.value('local-name(.)','NVARCHAR(MAX)')
ORDER BY (SELECT 1)) AS NVARCHAR )
+ N']' AS NVARCHAR(MAX)
) AS XPath,
ROW_NUMBER() OVER(
PARTITION BY c.value('local-name(.)','NVARCHAR(MAX)')
ORDER BY (SELECT 1)) AS Position,
CAST(
SPACE(2 * p.lvl - 1) + N'|' + REPLICATE(N'-', 1)
+ c.value('local-name(.)','NVARCHAR(MAX)') AS NVARCHAR(MAX)
) AS Tree,
CAST( c.value('text()[1]','NVARCHAR(MAX)') AS NVARCHAR(MAX) ) AS Value,
c.query('.') AS this,
c.query('*') AS t,
CAST(
p.Sort
+ CAST( (lvl + 1) * 1024
+ (ROW_NUMBER() OVER(ORDER BY (SELECT 1)) * 2) AS VARBINARY(4)
) AS VARBINARY(MAX) ) AS Sort,
CAST(
(lvl + 1) * 1024
+ (ROW_NUMBER() OVER(ORDER BY (SELECT 1)) * 2) AS INT
)
FROM cte p
CROSS APPLY p.t.nodes('*') b(c)
), cte2 AS (
SELECT
lvl AS Depth,
Name AS NodeName,
ParentName,
ParentPosition,
NodeType,
FullPath,
XPath,
Position,
Tree AS TreeView,
Value,
this AS XMLData,
Sort, ID
FROM cte
UNION ALL
/*------------------------------------------------------------------
Attributes do not need recursive calls. So add the attributes
to the query output at the end.
------------------------------------------------------------------*/
SELECT
p.lvl,
x.value('local-name(.)','NVARCHAR(MAX)'),
p.Name,
p.Position,
CAST(N'Attribute' AS NVARCHAR(20)),
p.FullPath + N'/@' + x.value('local-name(.)','NVARCHAR(MAX)'),
p.XPath + N'/@' + x.value('local-name(.)','NVARCHAR(MAX)'),
1,
SPACE(2 * p.lvl - 1) + N'|' + REPLICATE('-', 1)
+ N'@' + x.value('local-name(.)','NVARCHAR(MAX)'),
x.value('.','NVARCHAR(MAX)'),
NULL,
p.Sort,
p.ID + 1
FROM cte p
CROSS APPLY this.nodes('/*/@*') a(x)
)
SELECT
ROW_NUMBER() OVER(ORDER BY Sort, ID) AS ID,
ParentName, ParentPosition,Depth, NodeName, Position,
NodeType, FullPath, XPath, TreeView, Value, XMLData
FROM cte2;
go
SELECT * FROM dbo.XMLTable('
<employees>
<emp name="jacob"/>
<emp name="steve">
<phone>123</phone>
some text
</emp>
</employees>
')
一个非常直接的方法可能是这样(而 @x
是你的 XML-执行计划):
DECLARE @x XML=
N'<root>
<ElementE1 AttributA1="A1-text belongs to E1[1]" OneMore="xyz">E1-Text 2</ElementE1>
<ElementE1 AttributA1="A1-text belongs to E1[2]">E1-Text 2</ElementE1>
<ElementParent>
<subElement test="sub"/>
Free text
</ElementParent>
</root>';
DECLARE @idoc INT;
EXEC sp_xml_preparedocument @idoc OUTPUT, @x;
SELECT * FROM OPENXML (@idoc, '*');
EXEC sp_xml_removedocument @idoc;
结果(不是所有列)
+----+----------+----------+--------------+------+--------------------------+
| id | parentid | nodetype | localname | prev | text |
+----+----------+----------+--------------+------+--------------------------+
| 0 | NULL | 1 | root | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 2 | 0 | 1 | ElementE1 | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 3 | 2 | 2 | AttributA1 | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 13 | 3 | 3 | #text | NULL | A1-text belongs to E1[1] |
+----+----------+----------+--------------+------+--------------------------+
| 4 | 2 | 2 | OneMore | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 14 | 4 | 3 | #text | NULL | xyz |
+----+----------+----------+--------------+------+--------------------------+
| 5 | 2 | 3 | #text | NULL | E1-Text 2 |
+----+----------+----------+--------------+------+--------------------------+
| 6 | 0 | 1 | ElementE1 | 2 | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 7 | 6 | 2 | AttributA1 | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 15 | 7 | 3 | #text | NULL | A1-text belongs to E1[2] |
+----+----------+----------+--------------+------+--------------------------+
| 8 | 6 | 3 | #text | NULL | E1-Text 2 |
+----+----------+----------+--------------+------+--------------------------+
| 9 | 0 | 1 | ElementParent| 6 | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 10 | 9 | 1 | subElement | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 11 | 10 | 2 | test | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 16 | 11 | 3 | #text | NULL | sub |
+----+----------+----------+--------------+------+--------------------------+
| 12 | 9 | 3 | #text | 10 | Free text |
+----+----------+----------+--------------+------+--------------------------+
id
清楚地表明,该算法是 广度优先 ,没有 id=1
(为什么)和 nodetype
允许区分元素、属性和(浮动)文本。 prev
列指向链中的一个兄弟。缺少的列与名称空间有关...
FROM OPENXML
的方法已经过时,但这是一种可能仍然非常有用的罕见情况...
您会得到一个包含 ID 和 ParentID 的列表,您可以使用递归 CTE 查询...这取决于您之后想用它做什么...
首先我会说我讨厌 XML,可怕的东西,但有时是必要的。
我当前的问题是我正在尝试从执行计划(由用户提供,因此可以是任何大小)中获取 XML 并将其分解为 table 以供进一步使用操纵。我现在只有两个选择;
- 我可以计算出可用于执行计划的最大节点数量(我怀疑这会很多)并创建可用于任何 XML 输入的整个脚本。这将是一次性的事情,所以不是问题。
- 另一种方法是动态计算节点数并根据要求创建输出。
过去有人做过类似的练习吗?我发现的所有示例查询都已经知道输出字段。
用于将 XML 分解为元素和属性及其路径和父元素的脚本。来自 http://beyondrelational.com/modules/2/blogs/28/posts/10495/xquery-lab-58-select-from-xml.aspx
CREATE FUNCTION [dbo].[XMLTable](
@x XML
)
RETURNS TABLE
AS RETURN
/*----------------------------------------------------------------------
This INLINE TVF uses a recursive CTE that processes each element and
attribute of the XML document passed in.
----------------------------------------------------------------------*/
WITH cte AS (
/*------------------------------------------------------------------
Anchor part of the recursive query. Retrieves the root element
of the XML document
------------------------------------------------------------------*/
SELECT
1 AS lvl,
x.value('local-name(.)','NVARCHAR(MAX)') AS Name,
CAST(NULL AS NVARCHAR(MAX)) AS ParentName,
CAST(1 AS INT) AS ParentPosition,
CAST(N'Element' AS NVARCHAR(20)) AS NodeType,
x.value('local-name(.)','NVARCHAR(MAX)') AS FullPath,
x.value('local-name(.)','NVARCHAR(MAX)')
+ N'['
+ CAST(ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS NVARCHAR)
+ N']' AS XPath,
ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS Position,
x.value('local-name(.)','NVARCHAR(MAX)') AS Tree,
x.value('text()[1]','NVARCHAR(MAX)') AS Value,
x.query('.') AS this,
x.query('*') AS t,
CAST(CAST(1 AS VARBINARY(4)) AS VARBINARY(MAX)) AS Sort,
CAST(1 AS INT) AS ID
FROM @x.nodes('/*') a(x)
UNION ALL
/*------------------------------------------------------------------
Start recursion. Retrieve each child element of the parent node
------------------------------------------------------------------*/
SELECT
p.lvl + 1 AS lvl,
c.value('local-name(.)','NVARCHAR(MAX)') AS Name,
CAST(p.Name AS NVARCHAR(MAX)) AS ParentName,
CAST(p.Position AS INT) AS ParentPosition,
CAST(N'Element' AS NVARCHAR(20)) AS NodeType,
CAST(
p.FullPath
+ N'/'
+ c.value('local-name(.)','NVARCHAR(MAX)') AS NVARCHAR(MAX)
) AS FullPath,
CAST(
p.XPath
+ N'/'
+ c.value('local-name(.)','NVARCHAR(MAX)')
+ N'['
+ CAST(ROW_NUMBER() OVER(
PARTITION BY c.value('local-name(.)','NVARCHAR(MAX)')
ORDER BY (SELECT 1)) AS NVARCHAR )
+ N']' AS NVARCHAR(MAX)
) AS XPath,
ROW_NUMBER() OVER(
PARTITION BY c.value('local-name(.)','NVARCHAR(MAX)')
ORDER BY (SELECT 1)) AS Position,
CAST(
SPACE(2 * p.lvl - 1) + N'|' + REPLICATE(N'-', 1)
+ c.value('local-name(.)','NVARCHAR(MAX)') AS NVARCHAR(MAX)
) AS Tree,
CAST( c.value('text()[1]','NVARCHAR(MAX)') AS NVARCHAR(MAX) ) AS Value,
c.query('.') AS this,
c.query('*') AS t,
CAST(
p.Sort
+ CAST( (lvl + 1) * 1024
+ (ROW_NUMBER() OVER(ORDER BY (SELECT 1)) * 2) AS VARBINARY(4)
) AS VARBINARY(MAX) ) AS Sort,
CAST(
(lvl + 1) * 1024
+ (ROW_NUMBER() OVER(ORDER BY (SELECT 1)) * 2) AS INT
)
FROM cte p
CROSS APPLY p.t.nodes('*') b(c)
), cte2 AS (
SELECT
lvl AS Depth,
Name AS NodeName,
ParentName,
ParentPosition,
NodeType,
FullPath,
XPath,
Position,
Tree AS TreeView,
Value,
this AS XMLData,
Sort, ID
FROM cte
UNION ALL
/*------------------------------------------------------------------
Attributes do not need recursive calls. So add the attributes
to the query output at the end.
------------------------------------------------------------------*/
SELECT
p.lvl,
x.value('local-name(.)','NVARCHAR(MAX)'),
p.Name,
p.Position,
CAST(N'Attribute' AS NVARCHAR(20)),
p.FullPath + N'/@' + x.value('local-name(.)','NVARCHAR(MAX)'),
p.XPath + N'/@' + x.value('local-name(.)','NVARCHAR(MAX)'),
1,
SPACE(2 * p.lvl - 1) + N'|' + REPLICATE('-', 1)
+ N'@' + x.value('local-name(.)','NVARCHAR(MAX)'),
x.value('.','NVARCHAR(MAX)'),
NULL,
p.Sort,
p.ID + 1
FROM cte p
CROSS APPLY this.nodes('/*/@*') a(x)
)
SELECT
ROW_NUMBER() OVER(ORDER BY Sort, ID) AS ID,
ParentName, ParentPosition,Depth, NodeName, Position,
NodeType, FullPath, XPath, TreeView, Value, XMLData
FROM cte2;
go
SELECT * FROM dbo.XMLTable('
<employees>
<emp name="jacob"/>
<emp name="steve">
<phone>123</phone>
some text
</emp>
</employees>
')
一个非常直接的方法可能是这样(而 @x
是你的 XML-执行计划):
DECLARE @x XML=
N'<root>
<ElementE1 AttributA1="A1-text belongs to E1[1]" OneMore="xyz">E1-Text 2</ElementE1>
<ElementE1 AttributA1="A1-text belongs to E1[2]">E1-Text 2</ElementE1>
<ElementParent>
<subElement test="sub"/>
Free text
</ElementParent>
</root>';
DECLARE @idoc INT;
EXEC sp_xml_preparedocument @idoc OUTPUT, @x;
SELECT * FROM OPENXML (@idoc, '*');
EXEC sp_xml_removedocument @idoc;
结果(不是所有列)
+----+----------+----------+--------------+------+--------------------------+
| id | parentid | nodetype | localname | prev | text |
+----+----------+----------+--------------+------+--------------------------+
| 0 | NULL | 1 | root | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 2 | 0 | 1 | ElementE1 | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 3 | 2 | 2 | AttributA1 | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 13 | 3 | 3 | #text | NULL | A1-text belongs to E1[1] |
+----+----------+----------+--------------+------+--------------------------+
| 4 | 2 | 2 | OneMore | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 14 | 4 | 3 | #text | NULL | xyz |
+----+----------+----------+--------------+------+--------------------------+
| 5 | 2 | 3 | #text | NULL | E1-Text 2 |
+----+----------+----------+--------------+------+--------------------------+
| 6 | 0 | 1 | ElementE1 | 2 | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 7 | 6 | 2 | AttributA1 | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 15 | 7 | 3 | #text | NULL | A1-text belongs to E1[2] |
+----+----------+----------+--------------+------+--------------------------+
| 8 | 6 | 3 | #text | NULL | E1-Text 2 |
+----+----------+----------+--------------+------+--------------------------+
| 9 | 0 | 1 | ElementParent| 6 | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 10 | 9 | 1 | subElement | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 11 | 10 | 2 | test | NULL | NULL |
+----+----------+----------+--------------+------+--------------------------+
| 16 | 11 | 3 | #text | NULL | sub |
+----+----------+----------+--------------+------+--------------------------+
| 12 | 9 | 3 | #text | 10 | Free text |
+----+----------+----------+--------------+------+--------------------------+
id
清楚地表明,该算法是 广度优先 ,没有 id=1
(为什么)和 nodetype
允许区分元素、属性和(浮动)文本。 prev
列指向链中的一个兄弟。缺少的列与名称空间有关...
FROM OPENXML
的方法已经过时,但这是一种可能仍然非常有用的罕见情况...
您会得到一个包含 ID 和 ParentID 的列表,您可以使用递归 CTE 查询...这取决于您之后想用它做什么...