Visualizing customized NER tags with spaCy displaCy

I'm new to spaCy and Python, and I want to visualize NER with this library. This is the example I found:

import spacy
from spacy import displacy

NER = spacy.load("en_core_web_sm")

raw_text = "The Indian Space Research Organisation or is the national space agency of India, headquartered in Bengaluru. It operates under Department of Space which is directly overseen by the Prime Minister of India while Chairman of ISRO acts as executive of DOS as well."

text1 = NER(raw_text)

displacy.render(text1, style="ent", jupyter=True)

[Image: the example visualization]

However, I already have a list of custom labels and their positions:

 [812, 834, "POS"], [838, 853, "ORG"], [870, 888, "POS"], [892, 920, "ORG"], [925, 929, "ENGLEVEL"], [987, 1002, "SKILL"],...

I want to visualize my text with my own custom labels and entities instead of spaCy's default NER output. How can I do that?

You need to create character spans that represent your entities and attach them to your Doc object, like this:

import spacy
from spacy import displacy

nlp = spacy.blank('en')
raw_text = "The Indian Space Research Organisation or is the national space agency of India, headquartered in Bengaluru. It operates under Department of Space which is directly overseen by the Prime Minister of India while Chairman of ISRO acts as executive of DOS as well."
doc = nlp.make_doc(raw_text)
spans = [[812, 834, "POS"], [838, 853, "ORG"], [870, 888, "POS"], [892, 920, "ORG"], [925, 929, "ENGLEVEL"],
         [987, 1002, "SKILL"]]
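# Convert each (start, end, label) character offset into a Span on the Doc
ents = []
for span_start, span_end, label in spans:
    ent = doc.char_span(span_start, span_end, label=label)
    # char_span() returns None when the offsets do not map onto the Doc's tokens
    if ent is None:
        continue

    ents.append(ent)

# Overwrite the Doc's entities with the custom spans
doc.ents = ents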
displacy.render(doc, style="ent", jupyter=True)

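Change raw_text and spans to match your own data. If a span's start or end falls beyond the length of the text, doc.char_span() returns None, so you need to handle that case appropriately.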
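
As a rough illustration of handling the None case, here is a minimal sketch that records the skipped spans instead of dropping them silently, and assigns colors to the custom labels. The shortened text, the offsets computed with str.find, the LOC label, and the color values are all assumptions made for this example. Note that the offsets in the question (812 and above) are larger than the sample text, so they would all be skipped unless raw_text is replaced with the full document.

```python
import spacy
from spacy import displacy

nlp = spacy.blank("en")

# Shortened sample text (an assumption for this sketch)
raw_text = "The Indian Space Research Organisation is headquartered in Bengaluru."
doc = nlp.make_doc(raw_text)

# Hypothetical spans: offsets are computed with str.find so they match this text
org = "Indian Space Research Organisation"
loc = "Bengaluru"
org_start = raw_text.find(org)
loc_start = raw_text.find(loc)
spans = [
    [org_start, org_start + len(org), "ORG"],
    [loc_start, loc_start + len(loc), "LOC"],
    [812, 834, "POS"],  # deliberately beyond the end of the text
]

ents = []
skipped = []
for span_start, span_end, label in spans:
    ent = doc.char_span(span_start, span_end, label=label)
    if ent is None:
        # Keep track of spans that could not be mapped onto the Doc
        skipped.append((span_start, span_end, label))
    else:
        ents.append(ent)

doc.ents = ents

# Custom labels have no default color, so supply some (values are arbitrary)
options = {"colors": {"ORG": "#ffd966", "LOC": "#9fc5e8"}}

# jupyter=False makes render() return the HTML markup as a string;
# use jupyter=True inside a notebook to display it directly
html = displacy.render(doc, style="ent", jupyter=False, options=options)

print("skipped spans:", skipped)
```

Printing the skipped spans makes it easier to spot offsets that were computed against a different version of the text.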