Solr - 在短语搜索中包括 emdash

Solr - including emdash in phrase search

例如,我已经为两个包含文本的文档编制了索引:

doc1 - Test—emdash

doc2 - Test without emdash

我正在查询“—emdash”是 return 两个文档,我应该如何配置以便它 return 只查询具有确切短语的文档?

用于索引的字段类型是textgen

<fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- <filter class="solr.ASCIIFoldingFilterFactory"/> -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"
            />
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

1) 删除 WordDelimiter 过滤器 2) 添加映射字符过滤器:“-”=>“-”