How to return the corresponding line that matches our search keyword in whoosh?

Question

Suppose we are given the file a.txt:

hello world
good morning world
good night world

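Given that the keyword I want to search for is morning, I want to use the whoosh Python library to return the line of the text file a.txt that matches the keyword morning. So it would return good morning world. How can I achieve this?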

Update: this is my schema:

schema = Schema(title=TEXT(stored=True),
              path=ID(stored=True),
              content=TEXT(stored=True))

I then use the writer's add_document to add the text to the content field.

Answer 1

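Index the text file line by line, storing the line number as a NUMERIC field and the whole line as an ID field (storage is cheap, right?).

Something like the following (untested):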

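import os

from whoosh import index
from whoosh.fields import Schema, TEXT, ID, NUMERIC

schema = Schema(
    title=TEXT(stored=True),
    path=ID(stored=True),
    content=TEXT(stored=True),
    line_number=NUMERIC(int, 32, stored=True, signed=False),
    line_text=ID(stored=True),
)

# The new fields need an index built with this schema, so create it here
# rather than opening an older one.
os.makedirs("index", exist_ok=True)
ix = index.create_in("index", schema)
writer = ix.writer()

with open('a.txt') as f:
    for line_number, line in enumerate(f):
        line = line.rstrip('\n')  # drop the trailing newline before storing
        writer.add_document(
            title='This is a title',
            path='a.txt',
            content=line,
            line_number=line_number,
            line_text=line,
        )

# Nothing is written to the index until the writer is committed.
writer.commit()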

Obviously, you can extend this to index multiple text files:

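from whoosh import index

files_to_index = [
    {'title': 'Title A', 'path': 'a.txt'},
    {'title': 'Title B', 'path': 'b.txt'},
    {'title': 'Title C', 'path': 'c.txt'},
]

# Assumes an index with the line_number and line_text fields already exists.
ix = index.open_dir("index")
writer = ix.writer()

for file_to_index in files_to_index:
    with open(file_to_index['path']) as f:
        for line_number, line in enumerate(f):
            line = line.rstrip('\n')  # drop the trailing newline before storing
            writer.add_document(
                title=file_to_index['title'],
                path=file_to_index['path'],
                content=line,
                line_number=line_number,
                line_text=line,
            )

# Nothing is written to the index until the writer is committed.
writer.commit()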

python

whoosh

python-3.x