How do I fix the import error when trying to import 'SentenceSegmenter' from 'spacy.pipeline'?

ImportError: cannot import name 'SentenceSegmenter' from 'spacy.pipeline' 

spaCy version: 3.2.1

I know this class belongs to earlier versions of spaCy, but is there something similar in this version?

There are several ways to do sentence segmentation in spaCy. You can read about them in the documentation here: https://spacy.io/usage/linguistic-features#sbd

This example, copied as-is from the docs, shows how to split text into sentences with a trained English pipeline.

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a sentence. This is another sentence.")
assert doc.has_annotation("SENT_START")
for sent in doc.sents:
    print(sent.text)

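You can also use the rule-based sentencizer, which splits on punctuation and needs only the language class rather than a trained pipeline, like this (also from the docs):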

import spacy
from spacy.lang.en import English

nlp = English()  # just the language with no pipeline
nlp.add_pipe("sentencizer")
doc = nlp("This is a sentence. This is another sentence.")
for sent in doc.sents:
    print(sent.text)

This should work with spaCy 3.0.5 and later.
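
If what you need is closer to the old SentenceSegmenter with your own splitting strategy, a custom pipeline component is one option in spaCy 3. The following is a minimal sketch, not copied from the docs: the component name "semicolon_segmenter" and the split-on-semicolon rule are made up for illustration, and it assumes a blank English pipeline with no parser. When a parser is in the pipeline, the docs add a component like this before it.

```python
import spacy
from spacy.language import Language


# "semicolon_segmenter" is a hypothetical name chosen for this sketch.
@Language.component("semicolon_segmenter")
def semicolon_segmenter(doc):
    # Set a boundary flag on every token so the annotation is complete:
    # a token starts a sentence if it is first or follows a ";" token.
    for i, token in enumerate(doc):
        token.is_sent_start = i == 0 or doc[i - 1].text == ";"
    return doc


nlp = spacy.blank("en")  # tokenizer only, no trained components
nlp.add_pipe("semicolon_segmenter")

doc = nlp("First clause; second clause; third clause")
sentences = [sent.text for sent in doc.sents]
for sentence in sentences:
    print(sentence)
```

Swap the condition inside the loop for whatever rule your old strategy used; the rest of the component stays the same.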