Simple unflattening of an HTML file through XSL
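I have looked around for unflattening procedures through XSL, but none of them really works for me, although I believe my case is very simple. I have a collection of HTML files, always with the same structure, that I would like to unflatten through an XSL transformation. Basically, it is about wrapping all the elements that follow a <p class='subtitle'> in a <div> element, up to the next <p class='subtitle'>, and, ideally, still applying transformations to the elements inside the div individually, but this is optional (see below).
The source file looks like: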
[...some stuff on the page]
<p class='header'>Some text</p>
<p class='subtitle'>Subtitle 1</p>
<p class='content'>First paragraph of part 1, with some <span>Inside</span> and other
nested elements, on multiple levels</p>
<ul>a list with <li> inside</ul>
<p class='content'>Second paragraph of part 1</p>
<img src='xyz.jpg'/>
<p class='content'>Third paragraph of part 1</p>
<p class='subtitle'>Subtitle 2</p>
<p class='content'>First paragraph of part 2</p>
<p class='content'>Second paragraph of part 2</p>
<p class='subtitle'>Subtitle 3
[and so on…]
I would like to turn it into:
<div n='section1'>
<head>Subtitle 1</head>
<p>First paragraph of part 1, with some <span>Inside</span> and other
nested elements, on multiple levels</p>
<ul>a list with <li> inside</ul>
<p>Second paragraph of part 1</p>
<picture source='xyz.jpg'/>
<p>Third paragraph of part 1</p>
</div>
<div n="section2">
<head>Subtitle 2</head>
<p>First paragraph of part 2</p>
<p>Second paragraph of part 2</p>
</div>
<div n="section3">
<head>Subtitle 3</head>
[and so on…]
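I cannot find a way to solve this. Moreover, if as a first step it only unflattened the HTML file (strictly copying the elements into the div without transforming them), that would already be great.
Thanks in advance!
This is a classic positional grouping problem. Here is a starter: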
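<xsl:template match="body">
  <body>
    <!-- Start a new group at every subtitle paragraph -->
    <xsl:for-each-group select="*" group-starting-with="p[@class='subtitle']">
      <xsl:choose>
        <!-- The context item is the first element of the current group -->
        <xsl:when test="@class = 'subtitle'">
          <div n="section{position()}">
            <head>{.}</head>
            <!-- Everything in the group after the subtitle itself -->
            <xsl:apply-templates select="tail(current-group())"/>
          </div>
        </xsl:when>
        <!-- Leading group: elements that precede the first subtitle -->
        <xsl:otherwise>
          <xsl:apply-templates select="current-group()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </body>
</xsl:template>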
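For reference, a complete stylesheet built around that template might look like the sketch below. It is only one possible arrangement, under several assumptions that are not in the question: the input is well-formed XML with elements in no namespace, content paragraphs become plain <p> elements, <img src> becomes <picture source>, and everything else is copied unchanged. It sticks to XSLT 2.0 constructs (subsequence() and xsl:value-of). It also numbers sections by counting preceding subtitles rather than using position(), because position() counts the leading group as well and would label the first section "section2" whenever elements such as <p class='header'> come before the first subtitle.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml"/>

  <!-- Group the children of body; each subtitle paragraph opens a new group -->
  <xsl:template match="body">
    <body>
      <xsl:for-each-group select="*" group-starting-with="p[@class='subtitle']">
        <xsl:choose>
          <xsl:when test="self::p[@class='subtitle']">
            <!-- Number by subtitles seen so far, so a leading group is not counted -->
            <div n="section{count(preceding-sibling::p[@class='subtitle']) + 1}">
              <head><xsl:value-of select="."/></head>
              <!-- All members of the group except the subtitle itself -->
              <xsl:apply-templates select="subsequence(current-group(), 2)"/>
            </div>
          </xsl:when>
          <xsl:otherwise>
            <!-- Elements that precede the first subtitle stay outside any div -->
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </body>
  </xsl:template>

  <!-- Assumed mapping: drop the class attribute from content paragraphs -->
  <xsl:template match="p[@class='content']">
    <p><xsl:apply-templates/></p>
  </xsl:template>

  <!-- Assumed mapping: img/@src becomes picture/@source -->
  <xsl:template match="img">
    <picture source="{@src}"/>
  </xsl:template>

  <!-- Identity rule: copy everything that has no more specific template -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Removing the two mapping templates leaves a pure unflattening pass, where the elements are copied into each div as they are.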
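Note that xsl:for-each-group requires XSLT 2.0 or later; this is much harder to do with XSLT 1.0. The tail() function and the {.} text value template go one step further and need XSLT 3.0 (with expand-text="yes" on the stylesheet or an enclosing element); in XSLT 2.0, subsequence(current-group(), 2) and <xsl:value-of select="."/> do the same job.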
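To give an idea of what the XSLT 1.0 route involves, here is a minimal sketch that uses a key instead of xsl:for-each-group. It is a swapped-in technique, not part of the answer above: each non-subtitle child of body is indexed by the generated id of its nearest preceding subtitle, and each subtitle then pulls in its own members. The key name by-subtitle and the mode name section are made up for illustration, and the sketch again assumes well-formed input with elements in no namespace.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml"/>

  <!-- Index each non-subtitle child of body by its nearest preceding subtitle -->
  <xsl:key name="by-subtitle"
           match="body/*[not(self::p[@class='subtitle'])]"
           use="generate-id(preceding-sibling::p[@class='subtitle'][1])"/>

  <xsl:template match="body">
    <body>
      <!-- Elements that come before the first subtitle stay outside any div -->
      <xsl:apply-templates
          select="*[not(self::p[@class='subtitle'])
                    and not(preceding-sibling::p[@class='subtitle'])]"/>
      <!-- One div per subtitle -->
      <xsl:apply-templates select="p[@class='subtitle']" mode="section"/>
    </body>
  </xsl:template>

  <xsl:template match="p[@class='subtitle']" mode="section">
    <!-- position() here counts only the subtitles selected above -->
    <div n="section{position()}">
      <head><xsl:value-of select="."/></head>
      <!-- Members of this section, looked up through the key -->
      <xsl:apply-templates select="key('by-subtitle', generate-id())"/>
    </div>
  </xsl:template>

  <!-- Identity rule: copy everything else unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The grouping logic ends up spread over a key declaration and two templates, which is a large part of why the 2.0 instruction is the more comfortable choice when a 2.0 or 3.0 processor is available.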