XSL Transform of Nested Attributes

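My task at hand is to extract the text, together with any associated font attributes, from the following (simplified) sample XML into a FileMaker database. Example XML: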

<Font Id="Arial" Script="normal" Size="32" Underlined="no" Italic="no" Weight="normal">
    <Paragraph>
        <Text>This <Font Italic="yes">word</Font> is italic</Text>
        <Text>This entire line has no formatting</Text>
        <Text>This<Font Italic="yes">line</Font><Font Underlined="yes" Italic = "yes"> has multiple formats</Font></Text>
    </Paragraph>

    <Paragraph>
        <Text>This is the first line of the second paragraph and has no formatting</Text>
        <Text>This line also has no formatting</Text>
        <Text><Font Underlined="yes">This entire line is underlined</Font></Text>
    </Paragraph>
</Font>

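As you can see, the <Paragraph> elements are contained in a <Font> node (I hope I am referring to these parts correctly). I have no trouble when there are no nested font attributes, or when the nested font attributes enclose the entire text data. What I am stuck on is how to handle lines of text with nested attributes inside the text data, such as the first and third lines of the first paragraph.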

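What I want to do is capture each fragment of data together with its attributes. My schema allows each line of text to contain up to three nested font attributes (a, b, c). Using the sample XML file, my FileMaker database should look like this for paragraph 1 (simplified):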

Record 1
Line 1a Text: This
Line 1a Italic: (no value)
Line 1a Underlined: (no value)

Line 1b Text: word
Line 1b Italic: yes
Line 1b Underlined: (no value)

Line 1c Text: is italic
Line 1c Italic: (no value)
Line 1c Underlined: (no value)

Line 2a Text: This entire line has no formatting
Line 2a Italic: (no value)
Line 2a Underlined: (no value)

Line 2b Text: (no value)
Line 2b Italic: (no value)
Line 2b Underlined: (no value)

Line 2c Text: (no value)
Line 2c Italic: (no value)
Line 2c Underlined: (no value)

Line 3a Text: This
Line 3a Italic: (no value)
Line 3a Underlined: (no value)

Line 3b Text: line
Line 3b Italic: yes
Line 3b Underlined: (no value)

Line 3c Text: has multiple formats
Line 3c Italic: yes
Line 3c Underlined: yes

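Of course, I cannot predict when or where formatting will be applied. I hope I have made myself clear, and I would greatly appreciate any pointers to help me with this task.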

I suggest you try something like this, at least as a starting point:

XSLT

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/Font">
    <FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
        <METADATA>
            <FIELD NAME="Text" TYPE="TEXT"/>
            <FIELD NAME="IsItalic" TYPE="TEXT"/>
            <FIELD NAME="IsUnderline" TYPE="TEXT"/>
            <FIELD NAME="Paragraph" TYPE="TEXT"/>
        </METADATA>
        <RESULTSET>
            <!-- create a record for each text node, descendant of Paragraph -->
            <xsl:for-each select="Paragraph//text()">
                <ROW>
                    <!-- get the value of the current text node itself  -->
                    <COL><DATA><xsl:value-of select="."/></DATA></COL>
                    <!-- get the value of @Italic from the nearest ancestor that has such attribute -->
                    <COL><DATA><xsl:value-of select="ancestor::*[@Italic][1]/@Italic"/></DATA></COL>
                    <!-- get the value of @Underlined from the nearest ancestor that has such attribute -->
                    <COL><DATA><xsl:value-of select="ancestor::*[@Underlined][1]/@Underlined"/></DATA></COL>
                    <!-- get the ID of the ancestor Paragraph -->
                    <COL><DATA><xsl:value-of select="generate-id(ancestor::Paragraph)"/></DATA></COL>
                </ROW>
            </xsl:for-each>
        </RESULTSET>
    </FMPXMLRESULT>
</xsl:template>

</xsl:stylesheet>

Applied to your input example, this produces one record for each text node.

Note that the paragraph ID is unique only within the scope of the current transformation, not universally.
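
If a paragraph value that stays the same between runs is preferred over generate-id(), one possible alternative is the position of the Paragraph among its siblings. The following is only a sketch under that assumption: the field name ParagraphNumber is made up for illustration, and the number is stable only as long as the order of paragraphs in the source file does not change.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/Font">
    <FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
        <METADATA>
            <FIELD NAME="Text"/>
            <!-- hypothetical field name, not part of the original answer -->
            <FIELD NAME="ParagraphNumber"/>
        </METADATA>
        <RESULTSET>
            <!-- create a record for each text node, descendant of Paragraph -->
            <xsl:for-each select="Paragraph//text()">
                <ROW>
                    <COL><DATA><xsl:value-of select="."/></DATA></COL>
                    <!-- 1-based position of the enclosing Paragraph: count its preceding Paragraph siblings and add one -->
                    <COL><DATA><xsl:value-of select="count(ancestor::Paragraph/preceding-sibling::Paragraph) + 1"/></DATA></COL>
                </ROW>
            </xsl:for-each>
        </RESULTSET>
    </FMPXMLRESULT>
</xsl:template>

</xsl:stylesheet>
```

With the sample input, text nodes from the first Paragraph get 1 and those from the second get 2, whichever XSLT processor is used.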


Added:

As far as I can tell, here is a stylesheet that will create one record for each Paragraph, with each record being a strict grid of 3 lines by 3 text nodes.

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://www.filemaker.com/fmpxmlresult">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/Font">
    <FMPXMLRESULT>
        <METADATA>
            <FIELD NAME="Line 1a Text"/>
            <FIELD NAME="Line 1a Italic"/>
            <FIELD NAME="Line 1a Underlined"/>

            <FIELD NAME="Line 1b Text"/>
            <FIELD NAME="Line 1b Italic"/>
            <FIELD NAME="Line 1b Underlined"/>

            <FIELD NAME="Line 1c Text"/>
            <FIELD NAME="Line 1c Italic"/>
            <FIELD NAME="Line 1c Underlined"/>

            <FIELD NAME="Line 2a Text"/>
            <FIELD NAME="Line 2a Italic"/>
            <FIELD NAME="Line 2a Underlined"/>

            <FIELD NAME="Line 2b Text"/>
            <FIELD NAME="Line 2b Italic"/>
            <FIELD NAME="Line 2b Underlined"/>

            <FIELD NAME="Line 2c Text"/>
            <FIELD NAME="Line 2c Italic"/>
            <FIELD NAME="Line 2c Underlined"/>

            <FIELD NAME="Line 3a Text"/>
            <FIELD NAME="Line 3a Italic"/>
            <FIELD NAME="Line 3a Underlined"/>

            <FIELD NAME="Line 3b Text"/>
            <FIELD NAME="Line 3b Italic"/>
            <FIELD NAME="Line 3b Underlined"/>

            <FIELD NAME="Line 3c Text"/>
            <FIELD NAME="Line 3c Italic"/>
            <FIELD NAME="Line 3c Underlined"/>
        </METADATA>
        <RESULTSET>
            <!-- create a record for each Paragraph -->
            <xsl:for-each select="Paragraph">
                <ROW>
                    <!-- for each line ...  -->
                    <xsl:for-each select="Text">
                        <xsl:variable name="text-nodes" select=".//text()" />
                        <!-- process the first three text nodes  -->
                        <xsl:call-template name="create-cells">
                            <xsl:with-param name="text-node" select="$text-nodes[1]"/>
                        </xsl:call-template>
                        <xsl:call-template name="create-cells">
                            <xsl:with-param name="text-node" select="$text-nodes[2]"/>
                        </xsl:call-template>
                        <xsl:call-template name="create-cells">
                            <xsl:with-param name="text-node" select="$text-nodes[3]"/>
                        </xsl:call-template>
                    </xsl:for-each> 
                </ROW>
            </xsl:for-each>
        </RESULTSET>
    </FMPXMLRESULT>
</xsl:template>

<xsl:template name="create-cells">
    <xsl:param name="text-node"/>
    <!-- get the value of the text node itself  -->
    <COL><DATA><xsl:value-of select="$text-node"/></DATA></COL>
    <!-- get the value of @Italic from the nearest ancestor that has such attribute -->
    <COL><DATA><xsl:value-of select="$text-node/ancestor::*[@Italic][1]/@Italic"/></DATA></COL>
    <!-- get the value of @Underlined from the nearest ancestor that has such attribute -->
    <COL><DATA><xsl:value-of select="$text-node/ancestor::*[@Underlined][1]/@Underlined"/></DATA></COL>
</xsl:template>

</xsl:stylesheet>

The result will be two records, one for each Paragraph.
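
One difference from the layout in the question: because the outermost Font element carries Italic="no" and Underlined="no", the ancestor lookup returns "no" for unformatted text, where the question shows an empty value. If empty cells are preferred, one possible adjustment is to consider only nested Font elements, that is, those whose parent is an element. This is a sketch under that assumption, shown in the simpler one-record-per-text-node form; the same predicate could be used inside the create-cells template.

```xml
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/Font">
    <FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
        <METADATA>
            <FIELD NAME="Text"/>
            <FIELD NAME="Italic"/>
            <FIELD NAME="Underlined"/>
        </METADATA>
        <RESULTSET>
            <xsl:for-each select="Paragraph//text()">
                <ROW>
                    <COL><DATA><xsl:value-of select="."/></DATA></COL>
                    <!-- [parent::*] excludes the outermost Font, whose parent is the document root rather than an element -->
                    <COL><DATA><xsl:value-of select="ancestor::Font[parent::*][@Italic][1]/@Italic"/></DATA></COL>
                    <COL><DATA><xsl:value-of select="ancestor::Font[parent::*][@Underlined][1]/@Underlined"/></DATA></COL>
                </ROW>
            </xsl:for-each>
        </RESULTSET>
    </FMPXMLRESULT>
</xsl:template>

</xsl:stylesheet>
```

With this variant, "word" in the first line still reports Italic as "yes", while the surrounding unformatted text gets empty Italic and Underlined cells instead of the inherited "no".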