Abbreviating text with whitespace with XSLT
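I want to extract short lemmas from a text to use in explanatory notes. That is, if the text is too long, only its first and last words should be output. This works for: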
<?xml version="1.0" encoding="UTF-8"?>
<lemma>
<a><b>I</b> can what I can and <b><c>what</c></b> I can't I can</a>
</lemma>
when applying this XSLT:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
version="2.0">
<xsl:output method="xml" encoding="utf-8" indent="yes"/>
<!-- Identity template : copy all text nodes, elements and attributes -->
<xsl:template match="@*|node()">
<xsl:copy copy-namespaces="no">
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="lemma">
<xsl:value-of select="."/>
<xsl:choose>
<xsl:when test="string-length(normalize-space(a)) > 20">
<xsl:value-of select="tokenize(a,' ')[1]"/>
<xsl:text> […] </xsl:text>
<xsl:value-of select="tokenize(a,' ')[last()]"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="a"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
which produces the desired output:
I can what I can and what I can't I can
I […] can
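Unfortunately, whenever two child elements are directly adjacent, the space between them is encoded as a child element named "space". The solution above does not work for: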
<lemma>
<a><b>I</b><space/><b>can</b> what I can and what I can't I can</a>
</lemma>
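I tried handling the space element separately first, but that does not work (I know why); I just don't know how to do it better. I suppose it could be done with two XSLT runs.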
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
version="2.0">
<xsl:output method="xml" encoding="utf-8" indent="yes"/>
<!-- Identity template : copy all text nodes, elements and attributes -->
<xsl:template match="@*|node()">
<xsl:copy copy-namespaces="no">
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="space">
 
</xsl:template>
<xsl:template match="lemma">
<xsl:apply-templates select="space"/>
<xsl:value-of select="."/>
<xsl:choose>
<xsl:when test="string-length(normalize-space(a)) > 20">
<xsl:value-of select="tokenize(a,' ')[1]"/>
<xsl:text> […] </xsl:text>
<xsl:value-of select="tokenize(a,' ')[last()]"/>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="a"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
</xsl:stylesheet>
Output:
Ican what I can and what I can't I can
Ican […] can
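You can use xsl:apply-templates to process a and save the result in a variable. xsl:value-of select="." reads the string value of the source nodes directly, so a template matching space never affects it; the variable instead holds the text after the templates have run, with each space element already turned into a blank.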
XML Input
<doc>
<lemma>
<a><b>I</b> can what I can and <b><c>what</c></b> I can't I can</a>
</lemma>
<lemma>
<a><b>I</b><space/><b>can</b> what I can and what I can't I can</a>
</lemma>
</doc>
XSLT 2.0
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="space">
<xsl:text> </xsl:text>
</xsl:template>
<xsl:template match="lemma">
<xsl:variable name="a">
<xsl:apply-templates select="a"/>
</xsl:variable>
<xsl:variable name="norm" select="normalize-space($a)"/>
<xsl:variable name="tokens" select="tokenize($norm,'\s')"/>
<xsl:copy>
<result>
<xsl:value-of select="$norm"/>
</result>
<result>
<xsl:value-of select="
if (string-length($norm) > 20) then
concat($tokens[1],' […] ', $tokens[last()])
else $norm"/>
</result>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
XML Output
<doc>
<lemma>
<result>I can what I can and what I can't I can</result>
<result>I […] can</result>
</lemma>
<lemma>
<result>I can what I can and what I can't I can</result>
<result>I […] can</result>
</lemma>
</doc>
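If you would rather not build a temporary tree, the same idea can probably be expressed in a single XPath 2.0 expression. The following is only a sketch under the same assumptions as above (a doc root, lemma/a children, and an empty space element standing for one blank); it is an alternative to the variable approach, not a replacement for it, and it is intended to yield the same result elements.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="lemma">
    <!-- Walk the text nodes and space elements below a in document order:
         a space element contributes one blank, a text node its own text. -->
    <xsl:variable name="norm" select="normalize-space(string-join(
        (for $n in a//(text() | space)
         return if ($n instance of element()) then ' ' else string($n)),
        ''))"/>
    <!-- After normalize-space, tokens are separated by exactly one blank -->
    <xsl:variable name="tokens" select="tokenize($norm, ' ')"/>
    <xsl:copy>
      <result>
        <xsl:value-of select="$norm"/>
      </result>
      <result>
        <xsl:value-of select="
          if (string-length($norm) > 20) then
            concat($tokens[1], ' […] ', $tokens[last()])
          else $norm"/>
      </result>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is that this version hard-codes the meaning of space inside the expression, whereas the template-based version lets you add further special elements simply by writing more templates.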