Insert element between self-closing elements
In an XML document, I want to insert elements between two self-closing elements. Consider the following example:
<body>
<p>Lorem ipsum dolor sit amet,
<lb/>consectetur adipisici elit,
<lb/>sed eiusmod tempor incidunt
<lb/>ut labore et dolore magna aliqua.
</p>
<p>Ut enim ad minim veniam,
<lb/>quis nostrud exercitation ullamco
<lb/>laboris nisi ut aliquid
<lb/>ex ea commodi consequat.
</p>
</body>
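So there is a structure of paragraphs (p) and line breaks (lb). My goal is to put each line into an element of its own, so I would like the transformation to produce the following result (or something similar):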
<body>
<p>
<l>Lorem ipsum dolor sit amet,</l>
<l>consectetur adipisici elit,</l>
<l>sed eiusmod tempor incidunt</l>
<l>ut labore et dolore magna aliqua.</l>
</p>
<p>
<l>Ut enim ad minim veniam,</l>
<l>quis nostrud exercitation ullamco</l>
<l>laboris nisi ut aliquid</l>
<l>ex ea commodi consequat.</l>
</p>
</body>
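Is this actually possible with XSLT? It does not seem to be a typical use case, since I have not found a way to do it. Any help would be appreciated.
EDIT:
Here is a more complex variant of the problem, which adds:
(1) highlighted, overlapping passages (hi) and
(2) an overlapping "choice" element, of which only the "sic" element needs to be preserved.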
<body>
<p>Lorem ipsum dolor sit amet,
<lb/>consectetur adipisici elit,
<lb/>sed eiusmod tempor <hi>incidunt
<lb/>ut labore</hi> et dolore magna aliqua.
</p>
<p>Ut enim ad minim <choice>
<sic>venima, <lb/>quis noster</sic>
<corr>veniam, quis nostrud</corr>
</choice> exercitation ullamco
<lb/>laboris nisi ut aliquid
<lb/>ex ea commodi consequat.
</p>
</body>
The desired output would, for example, include
(1) line numbers, and
(2) a @cont attribute indicating the continuation of a split element.
<body>
<p>
<l n="1">Lorem ipsum dolor sit amet,</l>
<l n="2">consectetur adipisici elit,</l>
<l n="3">sed eiusmod tempor <hi cont="true">incidunt</hi></l>
<l n="4"><hi cont="false">ut labore</hi> et dolore magna aliqua.</l>
</p>
<p>
<l n="5">Ut enim ad minim <sic cont="true">venima,</sic></l>
<l n="6"><sic cont="false">quis noster</sic> exercitation ullamco</l>
<l n="7">laboris nisi ut aliquid</l>
<l n="8">ex ea commodi consequat.</l>
</p>
</body>
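This pretty much covers the worst cases I have encountered. Thank you for your help!
If there is always an lb element between lines, you can do the following, because text content separated by child elements ends up in separate text nodes.
XSLT stylesheet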
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="body">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="p">
    <xsl:copy>
      <xsl:for-each select="text()">
        <l>
          <xsl:value-of select="normalize-space(.)"/>
        </l>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
XML output
<?xml version="1.0" encoding="UTF-8"?>
<body>
<p>
<l>Lorem ipsum dolor sit amet,</l>
<l>consectetur adipisici elit,</l>
<l>sed eiusmod tempor incidunt</l>
<l>ut labore et dolore magna aliqua.</l>
</p>
<p>
<l>Ut enim ad minim veniam,</l>
<l>quis nostrud exercitation ullamco</l>
<l>laboris nisi ut aliquid</l>
<l>ex ea commodi consequat.</l>
</p>
</body>
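If line numbers are also wanted for this simple input, one possible variation is to number the text nodes with xsl:number. This is only a sketch under the same assumption as above (every line is a plain text child of p). The count pattern p/text() is my own choice and not part of the original answer. Inline elements such as hi would still be dropped, so it does not cover the more complex variant.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <!-- drop whitespace-only text nodes so they are neither wrapped nor counted -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="body">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="p">
    <xsl:copy>
      <!-- each text child of p is one line, because lb splits the text -->
      <xsl:for-each select="text()">
        <l>
          <xsl:attribute name="n">
            <!-- assumption: count this text node plus every earlier
                 text child of any p, in document order -->
            <xsl:number level="any" count="p/text()"/>
          </xsl:attribute>
          <xsl:value-of select="normalize-space(.)"/>
        </l>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

On the first sample input this should number the lines consecutively across both paragraphs rather than restarting in each p.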
Here comes a more complex variant of the problem, which adds: (1)
highlighted, overlapping passages (hi) and (2) an overlapping "choice"
element, of which only the "sic" element needs to be preserved.
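Well, this is interesting, though possibly too much for a single SO question. In any case, I could not find a way to do all of this in one go. The following stylesheet uses two passes to return a result that is almost what you asked for: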
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:variable name="first-pass">
    <xsl:apply-templates select="/*" mode="first-pass"/>
  </xsl:variable>
  <!-- identity transform -->
  <xsl:template match="@*|node()" mode="#all">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>
  <!-- FIRST-PASS TEMPLATES -->
  <!-- remove choice wrapper, preserve only sic content -->
  <xsl:template match="choice" mode="first-pass">
    <xsl:apply-templates select="sic" mode="first-pass"/>
  </xsl:template>
  <!-- split hi and sic across lb -->
  <xsl:template match="hi | sic" mode="first-pass">
    <xsl:variable name="elem-name" select="local-name()" />
    <xsl:for-each-group select="text()" group-by="generate-id(preceding-sibling::lb[1])">
      <xsl:element name="{$elem-name}">
        <xsl:attribute name="cont" select="position()!=last()"/>
        <xsl:apply-templates select="current-group()" mode="first-pass"/>
      </xsl:element>
      <xsl:if test="position()!=last()">
        <lb/>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>
  <!-- OUTPUT -->
  <xsl:template match="/">
    <xsl:apply-templates select="$first-pass/*"/>
  </xsl:template>
  <!-- create a line element for each group separated by lb -->
  <xsl:template match="*[lb]">
    <xsl:copy>
      <xsl:for-each-group select="node()" group-ending-with="lb">
        <l n="{position()}">
          <xsl:apply-templates select="current-group()"/>
        </l>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
  <!-- suppress lb -->
  <xsl:template match="lb"/>
</xsl:stylesheet>
Result:
<?xml version="1.0" encoding="UTF-8"?>
<body>
<p>
<l n="1">Lorem ipsum dolor sit amet,
</l>
<l n="2">consectetur adipisici elit,
</l>
<l n="3">sed eiusmod tempor <hi cont="true">incidunt
</hi>
</l>
<l n="4">
<hi cont="false">ut labore</hi> et dolore magna aliqua.
</l>
</p>
<p>
<l n="1">Ut enim ad minim <sic cont="true">venima, </sic>
</l>
<l n="2">
<sic cont="false">quis noster</sic> exercitation ullamco
</l>
<l n="3">laboris nisi ut aliquid
</l>
<l n="4">ex ea commodi consequat.
</l>
</p>
</body>
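The result above differs from the requested output mainly in that n restarts in every paragraph. One way to get running line numbers, sketched here as an untested assumption rather than a confirmed fix, is to compute n in the second pass from the lb and p elements that precede the first node of each group. It assumes that after the first pass every lb is a direct child of a p, and that every preceding p contributes one final line. Only the n attribute differs from the two-pass stylesheet; the stray whitespace inside l is left as it is.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- first pass result, held in a temporary tree -->
  <xsl:variable name="first-pass">
    <xsl:apply-templates select="/*" mode="first-pass"/>
  </xsl:variable>

  <!-- identity transform, shared by both passes -->
  <xsl:template match="@*|node()" mode="#all">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <!-- first pass: keep only sic from choice -->
  <xsl:template match="choice" mode="first-pass">
    <xsl:apply-templates select="sic" mode="first-pass"/>
  </xsl:template>

  <!-- first pass: split hi and sic across lb, moving lb up one level -->
  <xsl:template match="hi | sic" mode="first-pass">
    <xsl:variable name="elem-name" select="local-name()"/>
    <xsl:for-each-group select="text()"
        group-by="generate-id(preceding-sibling::lb[1])">
      <xsl:element name="{$elem-name}">
        <xsl:attribute name="cont" select="position()!=last()"/>
        <xsl:apply-templates select="current-group()" mode="first-pass"/>
      </xsl:element>
      <xsl:if test="position()!=last()">
        <lb/>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>

  <!-- second pass starts from the temporary tree -->
  <xsl:template match="/">
    <xsl:apply-templates select="$first-pass/*"/>
  </xsl:template>

  <!-- second pass: one l per group; inside for-each-group the context
       item is the first node of the group, so n = lb elements before it
       + earlier p siblings + 1 (assumed numbering rule) -->
  <xsl:template match="*[lb]">
    <xsl:copy>
      <xsl:for-each-group select="node()" group-ending-with="lb">
        <l n="{count(preceding::lb) + count(../preceding-sibling::p) + 1}">
          <xsl:apply-templates select="current-group()"/>
        </l>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

  <!-- suppress lb in the output -->
  <xsl:template match="lb"/>
</xsl:stylesheet>
```

A paragraph without any lb is not matched by *[lb] and would be copied without an l wrapper, so the count of preceding p elements is only reliable when every paragraph has at least one line break, as in the sample input.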