Lucene Tokenizer deprecated

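The following Analyzer extension uses a number of deprecated subclasses. What are the non-deprecated replacements for StandardTokenizer, StandardFilter, LowerCaseFilter, and StopFilter, as used below?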

public class PorterAnalyzer extends Analyzer {

  private final Version version;

  public PorterAnalyzer(Version version) {
    this.version = version;
  }

  @Override
  @SuppressWarnings("resource")
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    final StandardTokenizer src = new StandardTokenizer(version, reader);
    TokenStream tok = new StandardFilter(version, src);
    tok = new LowerCaseFilter(version, tok);
    tok = new StopFilter(version, tok, StandardAnalyzer.STOP_WORDS_SET);
    tok = new PorterStemFilter(tok);
    return new TokenStreamComponents(src, tok);
  }

}

Just drop the Version arguments.


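I assume you are using Lucene 4.10 or thereabouts. All of the constructors that take a Version argument have been deprecated (and were removed as of 5.0), and replaced with constructors that do not take that argument.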

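import java.io.Reader;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.core.LowerCaseFilter;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.en.PorterStemFilter;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardFilter;
import org.apache.lucene.analysis.standard.StandardTokenizer;

public class PorterAnalyzer extends Analyzer {
  @Override
  @SuppressWarnings("resource")
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    // Same filter chain as in the question, minus the Version arguments.
    final StandardTokenizer src = new StandardTokenizer(reader);
    TokenStream tok = new StandardFilter(src);
    tok = new LowerCaseFilter(tok);
    tok = new StopFilter(tok, StandardAnalyzer.STOP_WORDS_SET);
    tok = new PorterStemFilter(tok);
    return new TokenStreamComponents(src, tok);
  }
}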