XSLT 1.0: copy node content from specific tags when transforming XML data

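I have an XSLT stylesheet that converts an XML file into another format. The source XML contains many formatting tags that are not compatible with the target schema, so I need to read the content of some tags and pass through only the element content.

The XSLT 1.0 transformation code is as follows: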

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
    <xsl:template match="/">
        <xsl:variable name="var1_initial" select="."/>
        <xsl:for-each select="procstep">
            <procl>
                <xsl:variable name="var11_cur" select="."/>
                <procstep>
                    <xsl:attribute name="time">1</xsl:attribute>
                    <title>
                        <xsl:for-each select="(./proct/node())[./self::text()]">
                            <xsl:variable name="var12_filter" select="."/>
                            <xsl:value-of select="normalize-space(string(.))"/>
                            <xsl:text> </xsl:text>
                        </xsl:for-each>
                    </title>
                </procstep>
            </procl>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>

Here is a sample of the source data:

<procl>
    <procstep>
        <proct>Connect the lifts together.</br>Lift the vehicle.</proct>
    </procstep>
    <procstep>
        <proct>Remove the screws.</br>Remove the plates.</proct>
    </procstep>
    <procstep>
        <proct>Remove the nuts and washers.</br>Remove the shield.</proct>
    </procstep>
    <procstep>
        <proct>Secure the exhaust pipe.</br>Install a strap.</br>Apply torque of <hp1>25 Nm</hp1>.</proct>
    </procstep>
    <procstep>
        <proct>Install the screws and nuts.</br>Use tool <hp2>256256</hp2> to fix the clamp.</proct>
    </procstep>
    <procstep>
        <proct>Install the nuts and screws.</br>Assemble the member in the following order:
            <table>
                <tgroup cols="2" colsep="1" rowsep="1">
                    <colspec colwidth="132.38*"/>
                    <colspec colwidth="132.10*"/>
                    <thead>
                        <row>
                            <entry align="left" valign="top">Value</entry>
                            <entry align="left" valign="top">Position</entry>
                        </row>
                    </thead>
                    <tbody>
                        <row>
                            <entry align="left" valign="top">25</entry>
                            <entry align="left" valign="top">Superior</entry>
                        </row>
                        <row>
                            <entry align="left" valign="top">12</entry>
                            <entry align="left" valign="top">Inferior</entry>
                        </row>
                    </tbody>
                </tgroup>
            </table>
        </proct>
    </procstep>
    <procstep>
        <proct>Lower the vehicle.</proct>
    </procstep>
    <procstep>
        <proct>Mark the <hp1>torque value</hp1> in the data sheet.</proct>
    </procstep>
</procl>

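I have <hp0>, <hp1>, <hp2> and <hp3> as formatting tags (bold, italic, underline and emphasis). The </br> tag is a line break, and it is dropped when I apply normalize-space.

The target output must keep the content inside <hp*></hp*> but drop the tags themselves. I tried copying everything, but the target schema does not allow <hp*> tags. It does not allow other tags either, such as the table content. Selecting with XPath pulls in content that I need to ignore.

The resulting output must be: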

<procl>
    <procstep>
        <title>Connect the lifts together. Lift the vehicle. </title>
    </procstep>
    <procstep>
        <title>Remove the screws. Remove the plates. </title>
    </procstep>
    <procstep>
        <title>Remove the nuts and washers. Remove the shield. </title>
    </procstep>
    <procstep>
        <title>Secure the exhaust pipe. Install a strap. Apply torque of 25 Nm. </title>
    </procstep>
    <procstep>
        <title>Install the screws and nuts. Use tool 256256 to fix the clamp. </title>
    </procstep>
    <procstep>
        <title>Install the nuts and screws. Assemble the member in the following order: </title>
    </procstep>
    <procstep>
        <title>Lower the vehicle. </title>
    </procstep>
    <procstep>
        <title>Mark the torque value in the data sheet. </title>
    </procstep>
</procl>

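I am using the XSLT in a Python application with the LET.XSLT library, and I can only use XSLT 1.0. My real XSLT code and source data are more complex; I have simplified them here to focus only on the transformation of the <proct> data.

So the question is: how can I pass the content inside <hp*></hp*> to another child node under <proct></proct>? Maybe this question is simple for you, but I am new to XSLT transformations.

Thank you in advance for your time.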

The input XML is not well-formed. I had to fix it by replacing each </br> with <br/>.

Input XML

<?xml version="1.0"?>
<procl>
    <procstep>
        <proct>Connect the lifts together.<br/>Lift the vehicle.</proct>
    </procstep>
    <procstep>
        <proct>Remove the screws.<br/>Remove the plates.</proct>
    </procstep>
    <procstep>
        <proct>Remove the nuts and washers.<br/>Remove the shield.</proct>
    </procstep>
    <procstep>
        <proct>Secure the exhaust pipe.<br/>Install a strap.<br/>Apply torque of <hp1>25 Nm</hp1>.</proct>
    </procstep>
    <procstep>
        <proct>Install the screws and nuts.<br/>Use tool <hp2>256256</hp2> to fix the clamp.</proct>
    </procstep>
    <procstep>
        <proct>Install the nuts and screws.<br/>Assemble the member in the following order:
            <table>
                <tgroup cols="2" colsep="1" rowsep="1">
                    <colspec colwidth="132.38*"/>
                    <colspec colwidth="132.10*"/>
                    <thead>
                        <row>
                            <entry align="left" valign="top">Value</entry>
                            <entry align="left" valign="top">Position</entry>
                        </row>
                    </thead>
                    <tbody>
                        <row>
                            <entry align="left" valign="top">25</entry>
                            <entry align="left" valign="top">Superior</entry>
                        </row>
                        <row>
                            <entry align="left" valign="top">12</entry>
                            <entry align="left" valign="top">Inferior</entry>
                        </row>
                    </tbody>
                </tgroup>
            </table>
        </proct>
    </procstep>
    <procstep>
        <proct>Lower the vehicle.</proct>
    </procstep>
    <procstep>
        <proct>Mark the <hp1>torque value</hp1> in the data sheet.</proct>
    </procstep>
</procl>

XSLT

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="utf-8" indent="yes" omit-xml-declaration="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="proct">
        <title>
            <xsl:apply-templates select="@*|node()"/>
        </title>
    </xsl:template>

    <xsl:template match="text()">
        <xsl:value-of select="normalize-space(.)"/>
    </xsl:template>

    <xsl:template match="hp1|hp2">
        <xsl:text> </xsl:text>
        <xsl:value-of select="."/>
        <xsl:text> </xsl:text>
    </xsl:template>

    <xsl:template match="table"/>
    <xsl:template match="br"/>
</xsl:stylesheet>

Output XML

<procl>
  <procstep>
    <title>Connect the lifts together.Lift the vehicle.</title>
  </procstep>
  <procstep>
    <title>Remove the screws.Remove the plates.</title>
  </procstep>
  <procstep>
    <title>Remove the nuts and washers.Remove the shield.</title>
  </procstep>
  <procstep>
    <title>Secure the exhaust pipe.Install a strap.Apply torque of 25 Nm .</title>
  </procstep>
  <procstep>
    <title>Install the screws and nuts.Use tool 256256 to fix the clamp.</title>
  </procstep>
  <procstep>
    <title>Install the nuts and screws.Assemble the member in the following order:</title>
  </procstep>
  <procstep>
    <title>Lower the vehicle.</title>
  </procstep>
  <procstep>
    <title>Mark the torque value in the data sheet.</title>
  </procstep>
</procl>

Except for one thing, your output could be produced quite simply using only this:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/procl | procstep">
    <xsl:copy>
        <xsl:apply-templates/>
    </xsl:copy>
</xsl:template>

<xsl:template match="proct">
    <title>
        <xsl:apply-templates/>
    </title>
</xsl:template>

<xsl:template match="br">
    <xsl:text> </xsl:text>
</xsl:template>

<xsl:template match="*[not(starts-with(name(), 'hp'))]" priority="-1"/>

</xsl:stylesheet>

The only difference between the actual result:

<?xml version="1.0" encoding="UTF-8"?>
<procl>
   <procstep>
      <title>Connect the lifts together. Lift the vehicle.</title>
   </procstep>
   <procstep>
      <title>Remove the screws. Remove the plates.</title>
   </procstep>
   <procstep>
      <title>Remove the nuts and washers. Remove the shield.</title>
   </procstep>
   <procstep>
      <title>Secure the exhaust pipe. Install a strap. Apply torque of 25 Nm.</title>
   </procstep>
   <procstep>
      <title>Install the screws and nuts. Use tool 256256 to fix the clamp.</title>
   </procstep>
   <procstep>
      <title>Install the nuts and screws. Assemble the member in the following order:
            </title>
   </procstep>
   <procstep>
      <title>Lower the vehicle.</title>
   </procstep>
   <procstep>
      <title>Mark the torque value in the data sheet.</title>
   </procstep>
</procl>

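and the expected output is the extra whitespace after "Assemble the member in the following order:".

You could try removing it with normalize-space(), but if you apply it globally to every text node passed to the output (as shown above), you will also destroy the existing spaces that may surround the hp* formatting elements. Trying to restore those spaces arbitrarily can produce a result that differs from the original text, for example when only part of a word is formatted. IMHO, it would be preferable to have a way to identify such problematic text nodes (e.g. those immediately followed by a table sibling). Alternatively, you could add another processing pass and apply normalize-space() to the title elements created by the first pass.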
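
Both ideas can be sketched as follows. The first stylesheet is the one above with one extra template added; it assumes the only problematic text nodes are those directly followed by a table sibling inside proct, as in the sample input. Note that normalize-space() also removes leading whitespace, so a text node that sits between an hp* element and a table would lose its leading space.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="/procl | procstep">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="proct">
        <title>
            <xsl:apply-templates/>
        </title>
    </xsl:template>

    <!-- A line break becomes a single space -->
    <xsl:template match="br">
        <xsl:text> </xsl:text>
    </xsl:template>

    <!-- Added: trim only the text node that immediately precedes a table -->
    <xsl:template match="proct/text()[following-sibling::node()[1][self::table]]">
        <xsl:value-of select="normalize-space(.)"/>
    </xsl:template>

    <!-- Suppress every element except the hp* formatting elements -->
    <xsl:template match="*[not(starts-with(name(), 'hp'))]" priority="-1"/>
</xsl:stylesheet>
```

The second sketch takes the two-pass route. In XSLT 1.0 the result of the first pass is a result tree fragment, so this version relies on the EXSLT exsl:node-set() extension function; that is an assumption about your processor (libxslt-based processors support it). The mode name "tidy" and the variable name "pass1" are arbitrary choices made for this illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
    <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Run the first pass into a variable, then tidy its result -->
    <xsl:template match="/">
        <xsl:variable name="pass1">
            <xsl:apply-templates select="procl"/>
        </xsl:variable>
        <xsl:apply-templates select="exsl:node-set($pass1)/*" mode="tidy"/>
    </xsl:template>

    <!-- First pass: same templates as in the stylesheet above -->
    <xsl:template match="/procl | procstep">
        <xsl:copy>
            <xsl:apply-templates/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="proct">
        <title>
            <xsl:apply-templates/>
        </title>
    </xsl:template>

    <xsl:template match="br">
        <xsl:text> </xsl:text>
    </xsl:template>

    <xsl:template match="*[not(starts-with(name(), 'hp'))]" priority="-1"/>

    <!-- Second pass: copy everything, normalizing each title as a whole -->
    <xsl:template match="@*|node()" mode="tidy">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" mode="tidy"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="title" mode="tidy">
        <title>
            <xsl:value-of select="normalize-space(.)"/>
        </title>
    </xsl:template>
</xsl:stylesheet>
```

Because the second pass normalizes the complete string value of each title rather than individual text nodes, the spaces around the former hp* elements survive; only leading, trailing and repeated whitespace is collapsed.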