Dependency graph traversal in XSLT for copying related elements of an XML model

Question

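I would like to demonstrate the power of XSL for data exploration by solving the following problem: given an XML file describing some kind of "entity-relationship" model, and an entity of that model identified by its name (assume an attribute in the XML schema serves as the identifier), I want a transformation that produces a new XML model containing the given entity plus all of its relatives, according to the transitive closure of the dependency relationship of that entity.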

For example, the input XML model is:

<root>
    <!-- my model is made of 3 entities : leaf, composite and object -->
    <!-- the xml elements <leaves>, <composites> and <objects> are just placeholders for these entities -->
    <!-- These placeholders are expected to be in that order in the output as well as in the input (schema constraints) -->
    <leaves>
        <!-- A, B, C are 3 types of different leaf nodes with their proper semantic in the model -->
        <A name="f1" others="oooo"/>
        <A name="f2" others="xxxx"/>
        <B name="f3" others="ssss"/>
        <C name="f4" others="gggg"/>    
    </leaves>
    <composites>
        <!-- composites contains only struct and union elements -->
        <struct name="structB" others="yyyy">
            <!-- composite pattern, struct can embed struct in a tree-ish fashion -->
            <sRef name="s6" nameRef="structA"/>
            <!-- order of declaration does not matter !!! here in the XML, structA is not yet declared but file is valid -->
            <uRef name="u7" nameRef="unionX"/>
        </struct>
        <!-- union is another kind of composition -->
        <union name="unionX" others="rrrr">
            <vRef name="u3" nameRef="f3" others="jjjj"/>
            <vRef name="u4" nameRef="f2" others="pppp"/>
        </union>
        <struct name="structA" others="hhhh">
            <vRef name="v1" nameRef="f1" others="jjjj"/>
            <vRef name="v2" nameRef="f4" others="pppp"/>
        </struct>
    </composites>
    <objects>
        <object name="objB" others="tttt">
            <field name="field1" nameRef="unionX" others="qqqq"/>
            <field name="field2" nameRef="f2" others="cccc"/>
        </object>
        <object name="objC" others="nnnn">
            <field name="fieldX" nameRef="structB" others="uuuu"/>
            <field name="fieldY" nameRef="" others="mmmm"/>
        </object>
        <object name="objMain" others="nnnn">
            <field name="fieldY" nameRef="structA" others="mmmm"/>
            <field name="fieldY" nameRef="f3" others="mmmm"/>
            <field name="object4" nameRef="objB" others="wwwww"/>
        </object>
    </objects>
</root>

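I want a transformation that, for a given name, creates a copy of the model containing only the information related to the element with that name and to its dependencies as described by the nameRef attributes.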

So for the element "field1" the output would be:

<root>
    <leaves>
        <A name="f1" others="oooo"/>
    </leaves>
    <!-- composites and objects placeholders shall be copied even when no elements in the graph traversal -->
    <composites/>
    <objects/>
</root>

And for "objB", the expected output would be:

<root>
    <leaves>
        <!-- element "f2" shall be copied only once in the output, although the node is encountered twice in the traversal of the "objB" tree:
            - "f2" is referenced under "field2" of "objB"
            - "f2" is referenced under "u4" of "unionX", which is referenced under "field1" of "objB"
        -->
        <A name="f2" others="xxxx"/>
        <B name="f3" others="ssss"/>
    </leaves>
    <composites>
        <union name="unionX" others="rrrr">
            <vRef name="u3" nameRef="f3" others="jjjj"/>
            <vRef name="u4" nameRef="f2" others="pppp"/>
        </union>
    </composites>
    <objects>
        <object name="objB" others="tttt">
            <field name="field1" nameRef="unionX" others="qqqq"/>
            <field name="field2" nameRef="f2" others="cccc"/>
        </object>
    </objects>
</root>

And so on.

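So far I have been working on this with basic XSL, but I am not very satisfied with the result, for the following reasons:

My transformation is not built on top of an "identity rule" copy
My transformation uses xsl:copy-of when it encounters a matching entity, but this breaks the design and violates the XSD schema
The output file does not conform to the XML Schema definition of the input, mainly because xsl:copy-of bypasses the traversal of the XML elements
When an entity occurs several times in the transitive closure of the dependencies, my transformation produces duplicate entities in the output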

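I only have some feelings and intuitions about what a good and elegant approach would be:

Start from the "identity transformation" template in order to respect the XML schema of the input
Use keys for grouping/sorting
Implement some kind of "Muenchian method" for it (not sure, actually; maybe that is only relevant for XSLT 1.0)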

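To simplify, you may make the following assumptions:

There are no circular dependencies (so a tree walk can be implemented)
nameRef / name pairs are cross-checked by a "key" in the XSD for the input
The input parameter "name" of the element to search for exists in the input XML model (although producing an "empty" valid XML in that case would be nice)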

The "empty" XML output model should look like this (because of the schema constraints):

<root>
    <leaves/>
    <composites/>
    <objects/>
</root>

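For completeness: the XSLT processor I currently use is Saxon, and the XSLT version is 2.0. Thanks for your help... I am not posting my own XSL, which I am not proud of, but I will if it seems helpful...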

Answer 1

I have tried to implement "a transformation that, for a given name, creates a copy of the model with only information related to the element of this name, and of its dependencies described by the nameRef attributes" at https://xsltfiddle.liberty-development.net/gWEamLs/6:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:param name="start-name" as="xs:string">objB</xsl:param>

  <xsl:key name="name-ref" match="*[@name]" use="@name"/>

  <xsl:function name="mf:traverse" as="element()*">
      <xsl:param name="start" as="element()?"/>
      <xsl:sequence select="$start, $start/*, $start/*[@nameRef]!key('name-ref', @nameRef, root(.))!mf:traverse(.)"/>
  </xsl:function>

  <xsl:param name="start-element" as="element()?" select="key('name-ref', $start-name)"/>

  <xsl:variable name="named-elements" select="mf:traverse($start-element)"/>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="*[@name and not(. intersect $named-elements)]"/>

</xsl:stylesheet>

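Based on a key and a recursive function, the code "first" computes the related elements as a sequence of element nodes in a global variable. "Then" the identity transformation, set up declaratively by xsl:mode on-no-match="shallow-copy", is simply extended with an empty template that matches the elements which have a name attribute but were not found by the recursive function to be related to the start element. This ensures that no unrelated element is copied to the output, while elements without a name attribute, such as the leaves, composites and objects placeholders, are always copied.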
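
The stylesheet above declares version="3.0" and relies on xsl:mode and the "!" operator, while the question mentions XSLT 2.0. As a rough sketch only, the same idea might be written for XSLT 2.0 as follows, assuming an explicit identity template in place of xsl:mode and a "for" expression in place of "!". The names mf:traverse, name-ref, start-name and named-elements are taken from the stylesheet above; the rewrite itself is an assumption, not part of the original answer:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="http://example.com/mf"
    exclude-result-prefixes="#all"
    version="2.0">

  <!-- Name of the entity whose dependency closure should be kept -->
  <xsl:param name="start-name" as="xs:string" select="'objB'"/>

  <!-- Index every element carrying a name attribute by that name -->
  <xsl:key name="name-ref" match="*[@name]" use="@name"/>

  <!-- Returns the start element, its children, and (recursively) everything
       reachable through the nameRef attributes of those children.
       Assumption carried over from the question: no circular references. -->
  <xsl:function name="mf:traverse" as="element()*">
    <xsl:param name="start" as="element()?"/>
    <xsl:sequence select="$start, $start/*,
        for $target in $start/*[@nameRef]/key('name-ref', @nameRef, root(.))
        return mf:traverse($target)"/>
  </xsl:function>

  <!-- Assumes the name is unique; a name shared by several elements
       would raise a type error here -->
  <xsl:variable name="start-element" as="element()?"
      select="key('name-ref', $start-name)"/>

  <xsl:variable name="named-elements" as="element()*"
      select="mf:traverse($start-element)"/>

  <!-- Identity template: XSLT 2.0 replacement for on-no-match="shallow-copy" -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop named elements that the traversal did not reach -->
  <xsl:template match="*[@name and not(. intersect $named-elements)]"/>

</xsl:stylesheet>
```

Because the output is produced by walking the input tree and filtering it, rather than by emitting each visited node, an element reached several times during the traversal (such as "f2" for "objB") can appear only once in the result. With Saxon, the parameter could be set on the command line, for example java -jar saxon.jar -s:model.xml -xsl:extract.xsl start-name=objMain, where the jar and file names are placeholders.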
