Apply xslt tokenize function to results of apply-templates
I have a chunk of XML formatted like this:
<line n="2">
<orig>of right hool herte <ex>&amp;</ex> in our<ex>e</ex><note place="bottom" anchored="true" xml:id="explanatory">Although “r” on the painted panels of the chapel is consistently written with an otiose mark when it concludes a word, the mark here is rendered more heavily and with a dot indicating suspension above the r. This rendering as “our<ex>e</ex>” is a linguistic outlier for the area based on the electronic <emph rend="italic">Linguistic Atlas of Late Medieval English</emph>’s linguistic profiles for “oure,” “our,” and “our<ex>e</ex>.” See eLALME's <ref target="http://archive.ling.ed.ac.uk/ihd/elalme_scripts/mapping/user-defined_maps.html">User Defined Maps</ref> for more information. Unfortunately the current online version (as of 12 July 2014) does not allow direct linking between static dotmaps and linguistic profiles.</note> best entent</orig>
</line>
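I need to be able to reduce it to just the plaintext: "of right hool herte & in oure best entent," and then tokenize on the space to get a list of either comma or tag-separated values. I accomplished the plaintext part with the following XSLT: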
<xsl:template match="tei:line" >
<xsl:apply-templates />
</xsl:template>
<xsl:template match="orig">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="ex">
<xsl:apply-templates/>
</xsl:template>
<xsl:template match="note"/>
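However, I can't get the tokenize function to work with apply-templates. If I try to use value-of instead, the templates for the nested tags no longer apply. Is there a way to run apply-templates on the XML and then tokenize the result for each element, all in a single XSLT? Thanks!
You don't need tokenize() to get this output: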
of right hool herte & in oure best entent
An identity transform plus a template that suppresses note will do it for you:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="note"/>
</xsl:stylesheet>
If you want the result comma-separated, you can capture the text output above in a variable and then apply tokenize to it, as you mentioned:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:variable name="result">
<xsl:apply-templates/>
</xsl:variable>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="note"/>
<xsl:template match="/">
<xsl:value-of select="string-join(tokenize(normalize-space($result), ' '), ',')"/>
</xsl:template>
</xsl:stylesheet>
Given your input XML, the XSLT above will produce the following text:
of,right,hool,herte,&,in,oure,best,entent
I need to be able to reduce it to just the plaintext: "of right hool
herte & in oure best entent," and then tokenize on the space to get a
list of either comma or tag-separated values.
Not sure what "tag-separated values" means. Given the following test input:
XML
<root>
<line n="2">
<orig>of right hool herte <ex>&amp;</ex> in our<ex>e</ex><note place="bottom" anchored="true" xml:id="explanatory">Although “r” on the painted panels of the chapel is consistently written with an otiose mark when it concludes a word, the mark here is rendered more heavily and with a dot indicating suspension above the r. This rendering as “our<ex>e</ex>” is a linguistic outlier for the area based on the electronic <emph rend="italic">Linguistic Atlas of Late Medieval English</emph>’s linguistic profiles for “oure,” “our,” and “our<ex>e</ex>.” See eLALME's <ref target="http://archive.ling.ed.ac.uk/ihd/elalme_scripts/mapping/user-defined_maps.html">User Defined Maps</ref> for more information. Unfortunately the current online version (as of 12 July 2014) does not allow direct linking between static dotmaps and linguistic profiles.</note> best entent</orig>
</line>
</root>
the following stylesheet:
XSLT 2.0
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/root">
<xsl:copy>
<xsl:apply-templates select="line"/>
</xsl:copy>
</xsl:template>
<xsl:template match="line">
<xsl:variable name="line-text">
<xsl:apply-templates/>
</xsl:variable>
<xsl:copy>
<xsl:copy-of select="@n"/>
<xsl:value-of select="tokenize(normalize-space($line-text), ' ')" separator=", "/>
</xsl:copy>
</xsl:template>
<xsl:template match="note"/>
</xsl:stylesheet>
will return:
<?xml version="1.0" encoding="UTF-8"?>
<root>
<line n="2">of, right, hool, herte, &amp;, in, oure, best, entent</line>
</root>
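If "tag-separated values" means that each token should be wrapped in its own element, the same approach can be adapted by iterating over the token sequence instead of joining it. The following is only a sketch under that assumption: the element name `w` is a hypothetical choice, not something taken from the question, and the stylesheet assumes the same un-namespaced test input as above.

```xml
<xsl:stylesheet version="2.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/root">
    <xsl:copy>
        <xsl:apply-templates select="line"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="line">
    <!-- Capture the plain text of the line; the empty note template below drops the note. -->
    <xsl:variable name="line-text">
        <xsl:apply-templates/>
    </xsl:variable>
    <xsl:copy>
        <xsl:copy-of select="@n"/>
        <!-- Emit one element per whitespace-separated token; "w" is an assumed name. -->
        <xsl:for-each select="tokenize(normalize-space($line-text), ' ')">
            <w><xsl:value-of select="."/></w>
        </xsl:for-each>
    </xsl:copy>
</xsl:template>

<xsl:template match="note"/>

</xsl:stylesheet>
```

With this variant each word of the line (of, right, hool, and so on) would end up in its own `w` child of `line`. If the real document is in the TEI namespace, as the `tei:line` pattern in the question suggests, the match patterns would need the `tei:` prefix or an `xpath-default-namespace` declaration on the stylesheet.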