Lucene.NET TextField not being indexed

Question

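Using .NET 6.0 and Lucene.NET-4.8.0-beta00016 from NuGet.

I'm having trouble implementing the quickstart example from the website. When a TextField is used in the document, the field appears not to be indexed: the search later in the BuildIndex method retrieves no results. If the TextField is changed to a StringField, the example works and the search returns valid results.

Why does StringField work while TextField does not? I read that a StringField is not analyzed but a TextField is, so could this have something to do with the StandardAnalyzer?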

public class LuceneFullTextSearchService {

private readonly IndexWriter _writer;
private readonly Analyzer _standardAnalyzer;

public LuceneFullTextSearchService(string indexName)
{
    // Compatibility version
    const LuceneVersion luceneVersion = LuceneVersion.LUCENE_48;
    string indexPath = Path.Combine(Environment.CurrentDirectory, indexName);
    Directory indexDir = FSDirectory.Open(indexPath);

    // Create an analyzer to process the text 
    _standardAnalyzer = new StandardAnalyzer(luceneVersion);

    // Create an index writer
    IndexWriterConfig indexConfig = new IndexWriterConfig(luceneVersion, _standardAnalyzer)
    {
        OpenMode = OpenMode.CREATE_OR_APPEND,
    };
    _writer = new IndexWriter(indexDir, indexConfig);
}

public void BuildIndex(string searchPath)
{
    Document doc = new Document();
    
    TextField docText = new TextField("title", "Apache", Field.Store.YES); 
    doc.Add(docText);
    
    _writer.AddDocument(doc);

    //Flush and commit the index data to the directory
    _writer.Commit();
    
    // Parse the user's query text
    Query query = new TermQuery(new Term("title", "Apache"));
    
    // Search
    using DirectoryReader reader = _writer.GetReader(applyAllDeletes: true);
    IndexSearcher searcher = new IndexSearcher(reader);
    TopDocs topDocs = searcher.Search(query, n: 2);

    // Show results
    Document resultDoc = searcher.Doc(topDocs.ScoreDocs[0].Doc);
    string title = resultDoc.Get("title");
}
}

Answer 1

StandardAnalyzer includes a LowerCaseFilter, so your text is indexed as the lower-case term "apache".

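However, a TermQuery is not passed through any analyzer, and when you build the query the text you use is "Apache" rather than "apache", so it does not produce any matches. A StringField, by contrast, is indexed verbatim as a single term, which is why the same query matches when you switch to StringField.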

// Parse the user's query text
Query query = new TermQuery(new Term("title", "Apache"));
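
To check what the analyzer actually produces for a given input, you can run the text through it and print the resulting terms. The following is a minimal sketch, assuming the Lucene.Net and Lucene.Net.Analysis.Common packages are referenced; the class name AnalyzerDemo is hypothetical and only there for illustration.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

// Hypothetical helper class, not part of the question's code.
public static class AnalyzerDemo
{
    public static void Main()
    {
        // Same analyzer and compatibility version as in the question.
        using Analyzer analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);

        // Analyze the same field name and text that the question indexes.
        using TokenStream stream = analyzer.GetTokenStream("title", "Apache");
        ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

        // Reset must be called before the first IncrementToken.
        stream.Reset();
        while (stream.IncrementToken())
        {
            // Each term printed here is what ends up in the index for a TextField.
            Console.WriteLine(term.ToString());
        }
        stream.End();
    }
}
```

For the input "Apache" this should print the single term apache, which is the exact text a TermQuery has to use in order to match.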

Option 1

Lower-case your search term.

// Parse the user's query text
Query query = new TermQuery(new Term("title", "Apache".ToLowerInvariant()));

Option 2

Use a QueryParser with the same analyzer that was used to build the index.

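// luceneVersion is local to the constructor in the question, so the constant is used directly here
QueryParser parser = new QueryParser(LuceneVersion.LUCENE_48, "title", _standardAnalyzer);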
Query query = parser.Parse("Apache");

The Lucene.Net.QueryParser package contains several implementations (the example above uses Lucene.Net.QueryParsers.Classic.QueryParser).
