Creating a Schematron to flag Latinisms (etc., i.e., e.g.), but it's also flagging words that contain those letters
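I created a Schematron to flag Latinisms in topics. It works a little too well: it also flags words that merely contain one of those letter combinations. For example, it needs to flag "etc", but it also flags "ketchup", because "etc" sits in the middle of "ketchup". I don't know what to change in my code so that it flags only the actual Latinisms and not other words.
Here is my code so far: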
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
queryBinding="xslt2">
<sch:let name="words" value="' i.e, etc., e.g., vs, et al, circa'"/>
<sch:let name="wordsToMatch" value="replace($words, ',', '|')"/>
<sch:pattern id = "LatinismsCheck">
<sch:rule context="text()">
<sch:report test="matches(., $wordsToMatch)" role="warn">
The following words should not be added in the topic:
<sch:value-of select="$words"/>
</sch:report>
</sch:rule>
</sch:pattern>
</sch:schema>
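Maybe you can mark word boundaries in the regular expression with '\b'. Standard XPath 2.0 regular expressions do not support '\b', so this relies on the ';j' flag, a Saxon extension that switches matches() to Java regex syntax. Like this: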
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
queryBinding="xslt2" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<sch:let name="words" value="'i.e.,etc.,e.g.'"/>
<sch:let name="wordsToMatch">
<xsl:for-each select="tokenize($words,',')">
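<!-- Escape the literal dots; an unescaped '.' matches any character, so 'e.g.' would also match 'eggs' -->
<xsl:value-of select="concat('(\b', replace(normalize-space(.), '\.', '\\.'), ')')"/>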
<xsl:if test="position() != last()">
<xsl:value-of select="'|'"/>
</xsl:if>
</xsl:for-each>
</sch:let>
<sch:pattern>
<sch:rule context="text()">
<sch:report test="matches(., string($wordsToMatch), ';j')" role="warn">
The following words should not be added in the topic: <sch:value-of select="$words"/>
</sch:report>
</sch:rule>
</sch:pattern>
</sch:schema>
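A variant of the same idea that avoids the embedded xsl:for-each might look like the sketch below. It assumes a Saxon-based Schematron processor (the ';j' flag is a Saxon extension, not standard XPath), and the variable name `alternatives` and the longer word list are only illustrative. Each term has its dots escaped, the terms are joined with '|', and the whole group is wrapped in '\b' at the front and a '(?!\w)' lookahead at the end, so that terms without a trailing dot such as "vs" do not match inside "vsync".

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">
  <!-- Comma-separated terms to flag (illustrative list; adjust as needed) -->
  <sch:let name="words" value="'i.e.,etc.,e.g.,vs,et al,circa'"/>
  <!-- Trim each term, escape its literal dots, and join the terms with '|' -->
  <sch:let name="alternatives"
           value="string-join(for $w in tokenize($words, ',')
                              return replace(normalize-space($w), '\.', '\\.'), '|')"/>
  <!-- '\b' requires a word boundary before the term;
       '(?!\w)' requires that no word character follows it -->
  <sch:let name="wordsToMatch"
           value="concat('\b(', $alternatives, ')(?!\w)')"/>
  <sch:pattern id="LatinismsCheck">
    <sch:rule context="text()">
      <!-- ';j' asks Saxon to use Java regex syntax, which supports '\b' and lookaheads -->
      <sch:report test="matches(., $wordsToMatch, ';j')" role="warn">
        The following words should not be added in the topic: <sch:value-of select="$words"/>
      </sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

With this pattern, "ketchup" and "eggs" should no longer be reported, while "etc." and "e.g." in running text still are. Note that a bare "etc" without the dot is not matched unless it is added to the list as its own term.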