.pos_ in spaCy is not returning any results in Python

Question

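I'm really new to programming and to Python, and I've been trying to use spaCy with Python 3.x. However, when I apply .pos_ to a text to find the parts of speech, I don't get any part-of-speech results. I have made sure spaCy is installed correctly and have looked through other Stack Overflow posts and this one GitHub post, but that didn't help.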

Here is the code I used:

from spacy.lang.en import English
parser = English()

tokens = parser('She ran')
dir(tokens[0])
print(dir(tokens[0]))


def show_POS(text):
    tokens = parser(text)
    for token in tokens:
        print(token.text, token.pos_)


show_POS("She hit the wall.")


def show_dep(text):
    tokens = parser(text)
    for token in tokens:
        print(" {} : {} : {} :{}".format(token.orth_,token.pos_,token.dep_,token.head))


print("token : POS : dep. : head")
print("-------------------------")
show_dep("She hit the wall.")

ex1 = parser("he drinks a water")
for word in ex1:
    print(word.text, word.pos_)

Here is the output:

/Users/dalals4/PycharmProjects/NLP-LEARNING/venv/bin/python 
/Users/dalals4/PycharmProjects/NLP_learning_practice_chp5.py
['_', '__bytes__', '__class__', '__delattr__', '__dir__', '__doc__', 
'__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', 
'__hash__', '__init__', '__init_subclass__', '__le__', '__len__', 
'__lt__', '__ne__', '__new__', '__pyx_vtable__', '__reduce__', 
'__reduce_ex__', '__repr__', '__setattr__', '__setstate__', 
'__sizeof__', '__str__', '__subclasshook__', '__unicode__', 
'ancestors', 'check_flag', 'children', 'cluster', 'conjuncts', 'dep', 
'dep_', 'doc', 'ent_id', 'ent_id_', 'ent_iob', 'ent_iob_', 'ent_type', 
'ent_type_', 'get_extension', 'has_extension', 'has_vector', 'head', 
'i', 'idx', 'is_alpha', 'is_ancestor', 'is_ascii', 'is_bracket', 
'is_currency', 'is_digit', 'is_left_punct', 'is_lower', 'is_oov', 
'is_punct', 'is_quote', 'is_right_punct', 'is_sent_start', 'is_space', 
'is_stop', 'is_title', 'is_upper', 'lang', 'lang_', 'left_edge', 
'lefts', 'lemma', 'lemma_', 'lex_id', 'like_email', 'like_num', 
'like_url', 'lower', 'lower_', 'n_lefts', 'n_rights', 'nbor', 'norm', 
'norm_', 'orth', 'orth_', 'pos', 'pos_', 'prefix', 'prefix_', 'prob', 
'rank', 'right_edge', 'rights', 'sent_start', 'sentiment', 
'set_extension', 'shape', 'shape_', 'similarity', 'string', 'subtree', 
'suffix', 'suffix_', 'tag', 'tag_', 'text', 'text_with_ws', 'vector', 
'vector_norm', 'vocab', 'whitespace_']
She 
hit 
the 
wall 
. 
token : POS : dep. : head
-------------------------
 She :  :  : She
 hit :  :  : hit
 the :  :  : the
 wall :  :  : wall
 . :  :  : .
he 
drinks 
a 
water 

Process finished with exit code 0

Any help would be greatly appreciated! Thank you very much in advance :)

Answer 1

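The problem here is that you are only importing the English language class, which contains language-specific data such as tokenization rules. You are not actually loading a model, and the model is what enables spaCy to predict part-of-speech tags and other linguistic annotations.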
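
One way to confirm this is to inspect the pipeline itself. The following is a minimal sketch, assuming spaCy is installed; `blank_nlp` is only an illustrative name. A bare `English()` object tokenizes text but has no pipeline components, so nothing ever sets `pos_`:

```python
from spacy.lang.en import English

# A bare language class: tokenizer only, no trained components.
blank_nlp = English()

# pipe_names lists the components that run after the tokenizer.
print(blank_nlp.pipe_names)  # an empty list here

doc = blank_nlp("She ran")
# Tokenization works, but pos_ stays an empty string for every token.
print([(token.text, token.pos_) for token in doc])
```

If `pipe_names` is empty, empty `pos_` and `dep_` values are expected behavior rather than a sign of a broken installation.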

If you haven't already done so, you first need to install a model package, for example the small English model:

python -m spacy download en_core_web_sm

You can then tell spaCy to load it by calling spacy.load:

import spacy

nlp = spacy.load('en_core_web_sm')
doc = nlp(u"she ran")
for token in doc:
    print(token.text, token.pos_)

This gives you an instance of the English class with the model weights loaded, so spaCy can predict part-of-speech tags, dependency labels and named entities.
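
If you are not sure whether the model package is installed, a small wrapper can make the failure easier to read. This is only a sketch: `load_pipeline` is a hypothetical helper name, not part of spaCy, and it relies on `spacy.load` raising `OSError` when the package cannot be found.

```python
import spacy


def load_pipeline(name="en_core_web_sm"):
    """Return the loaded pipeline, or None if the model package is missing.

    Hypothetical helper for illustration; not part of the spaCy API.
    """
    try:
        return spacy.load(name)
    except OSError:
        # spacy.load raises OSError when the model package cannot be found.
        print("Model '{0}' not found. Run: python -m spacy download {0}".format(name))
        return None


nlp = load_pipeline()
if nlp is not None:
    # With a trained model loaded, the pipeline is no longer empty.
    print(nlp.pipe_names)
    for token in nlp("She hit the wall."):
        print(token.text, token.pos_, token.dep_, token.head.text)
```

Once the model loads, the same loop from the question prints a part-of-speech tag and a dependency label for each token instead of empty strings.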

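If you're new to spaCy, I recommend checking out the spaCy 101 guide in the documentation. It explains the most important concepts and includes many examples that you can run.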
