Creating a Schematron to flag Latinisms (etc., i.e., e.g.), but it's also flagging words that contain those letters

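I created a Schematron to flag Latinisms in a topic. It works a little too well: it also flags words that contain the same letter combination. For example, it needs to flag "etc" but it also flags "ketchup", because "etc" appears in the middle of "ketchup". I don't know what to change in my code so that it flags only the actual Latinisms and not other words.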

Here is my code so far:

<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
queryBinding="xslt2">
   <sch:let name="words" value="' i.e, etc., e.g., vs, et al, circa'"/>
    <sch:let name="wordsToMatch" value="replace($words, ',', '|')"/>
    <sch:pattern id = "LatinismsCheck">
    <sch:rule context="text()">
        <sch:report test="matches(., $wordsToMatch)" role="warn">
            The following words should not be added in the topic:
            <sch:value-of select="$words"/>
           </sch:report>
        </sch:rule>
    </sch:pattern>
</sch:schema>

Maybe you could mark the word boundary with '\b' in the regular expression. Something like this:

<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
queryBinding="xslt2" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<sch:let name="words" value="'i.e.,etc.,e.g.'"/>
<sch:let name="wordsToMatch">
    <xsl:for-each select="tokenize($words,',')">
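        <!-- Escape literal dots so that e.g. 'i.e.' does not also match words like 'item' -->
        <xsl:value-of select="concat('(\b', replace(normalize-space(.), '\.', '\\.'), ')')"/>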
        <xsl:if test="position() != last()">
            <xsl:value-of select="'|'"/>
        </xsl:if>
    </xsl:for-each>
</sch:let>

<sch:pattern>
    <sch:rule context="text()">
        <sch:report test="matches(., string($wordsToMatch), ';j')" role="warn">
            The following words should not be added in the topic: <sch:value-of select="$words"/>
        </sch:report>
    </sch:rule>
</sch:pattern></sch:schema>
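
As far as I know, '\b' is not part of the standard XPath 2.0 regex syntax, and the ';j' flag is a Saxon extension that switches to Java regular expressions, so the schema above may only work with Saxon. Below is a minimal sketch of a variant that stays within plain XPath regex. It assumes the Latinisms are hard-coded with their dots escaped; the variable name `latinisms` and the pattern id `LatinismsCheckPortable` are made up for illustration.

```xml
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">
    <!-- Assumption: each Latinism is written out with its literal dots escaped as \. -->
    <!-- (^|\W) requires the start of the text node or a non-word character before the match, -->
    <!-- so "etc." is reported while the "etc" inside "ketchup" is not. -->
    <sch:let name="latinisms" value="'(^|\W)(i\.e\.|e\.g\.|etc\.)'"/>
    <sch:pattern id="LatinismsCheckPortable">
        <sch:rule context="text()">
            <!-- The 'i' flag makes the match case-insensitive -->
            <sch:report test="matches(., $latinisms, 'i')" role="warn">
                Avoid Latinisms such as i.e., e.g., and etc. in the topic.
            </sch:report>
        </sch:rule>
    </sch:pattern>
</sch:schema>
```

Terms without a trailing dot (for example "vs") would also need a check on the right-hand side, such as a following `(\W|$)`, which this sketch leaves out.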