XSLT to select and transform node (with regex match) and following siblings until next similar node
Somewhat simplified, my XML looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<entry>
    <span style="bold">1.</span>
    <def>this is a definition in the first sense.</def> – <cit type="example">
        <quote>This is a <span style="bold">quote</span> for the first sense. </quote>
    </cit>
    <span style="bold">2.</span>
    <def>This is a definition for the second sense</def> – <cit type="example">
        <quote>This is a quote for the second sense.</quote>
    </cit>
</entry>
I need to transform this with XSLT 2.0 or 3.0 to get the following:
<?xml version="1.0" encoding="UTF-8"?>
<entry>
    <sense n="1">
        <def>this is a definition in the first sense.</def> – <cit type="example">
            <quote>This is a <span style="bold">quote</span> for the first sense. </quote>
        </cit>
    </sense>
    <sense n="2">
        <def>This is a definition for the second sense</def> – <cit type="example">
            <quote>This is a quote for the second sense.</quote>
        </cit>
    </sense>
</entry>
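There can be more than two senses, and a span with style bold can also appear elsewhere, so we need to match specifically on something like tei:span[@style='bold'][matches(text(), '^\d\.')]
I'm having trouble putting this together in a stylesheet that also extracts the number from the span's text node and uses it as the attribute value of the new <sense> element.
I'd be very grateful for any tips.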
Here is an XSLT 3.0 example:
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs">
    <xsl:mode on-no-match="shallow-copy"/>
    <xsl:output indent="yes"/>
    <xsl:template match="entry">
        <xsl:copy>
            <xsl:for-each-group select="node()" group-starting-with="span[@style = 'bold'][matches(., '^[0-9]+\.$')]">
                <xsl:choose>
                    <xsl:when test="self::span[@style = 'bold'][matches(., '^[0-9]+\.$')]">
                        <sense nr="{replace(., '[^0-9]+', '')}">
                            <xsl:apply-templates select="current-group() except ."/>
                        </sense>
                    </xsl:when>
                    <xsl:otherwise><xsl:apply-templates select="current-group()"/></xsl:otherwise>
                </xsl:choose>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
For the sample input, the result looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<entry>
    <sense nr="1">
        <def>this is a definition in the first sense.</def> – <cit type="example">
            <quote>This is a <span style="bold">quote</span> for the first sense. </quote>
        </cit>
    </sense>
    <sense nr="2">
        <def>This is a definition for the second sense</def> – <cit type="example">
            <quote>This is a quote for the second sense.</quote>
        </cit>
    </sense>
</entry>
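Since the question allows XSLT 2.0 as well, here is a sketch of the same grouping for a 2.0 processor. It is an adaptation, not part of the example above. The only structural change is that xsl:mode, which exists only in 3.0, is replaced by a hand-written identity template. The attribute is named n here to match the output requested in the question, and the wrapper element name entry is taken from the template match above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>

  <!-- Identity template: stands in for xsl:mode on-no-match="shallow-copy" (3.0 only) -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="entry">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Start a new group at every bold span whose whole content is digits plus a dot -->
      <xsl:for-each-group select="node()"
          group-starting-with="span[@style = 'bold'][matches(., '^[0-9]+\.$')]">
        <xsl:choose>
          <!-- The context item is the first node of the current group -->
          <xsl:when test="self::span[@style = 'bold'][matches(., '^[0-9]+\.$')]">
            <!-- Keep only the digits of the marker; attribute name n is an assumption -->
            <sense n="{replace(., '[^0-9]+', '')}">
              <!-- Process the group without the marker span itself -->
              <xsl:apply-templates select="current-group() except ."/>
            </sense>
          </xsl:when>
          <xsl:otherwise>
            <!-- Nodes before the first marker pass through unwrapped -->
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The anchored pattern '^[0-9]+\.$' is what keeps the bold span inside the quote from starting a group: that span is not a child of entry, and its content would not match the pattern anyway.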
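The question writes its predicate with a tei: prefix, which suggests the real documents may be in the TEI namespace, while the stylesheet above matches elements in no namespace. If that is the case (an assumption, since the simplified sample declares no namespace), one possible adaptation is to set xpath-default-namespace so that unprefixed names in patterns and expressions refer to TEI elements, and to declare the same namespace as the default so that the new sense element is created in it.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.tei-c.org/ns/1.0"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">
  <!-- xpath-default-namespace: unprefixed element names in match/select/test mean TEI elements -->
  <!-- xmlns default: the literal result element sense is created in the TEI namespace -->
  <xsl:mode on-no-match="shallow-copy"/>
  <xsl:output indent="yes"/>

  <xsl:template match="entry">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:for-each-group select="node()"
          group-starting-with="span[@style = 'bold'][matches(., '^[0-9]+\.$')]">
        <xsl:choose>
          <xsl:when test="self::span[@style = 'bold'][matches(., '^[0-9]+\.$')]">
            <sense nr="{replace(., '[^0-9]+', '')}">
              <xsl:apply-templates select="current-group() except ."/>
            </sense>
          </xsl:when>
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Without these two attributes, a stylesheet written for no-namespace elements would simply not match namespaced input, and the shallow-copy mode would pass the document through unchanged.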