AttributeError: 'list' object has no attribute 'isdigit'
I want to extract part-of-speech (POS) tags from a pandas column. This is what I did:
import pandas as pd
from nltk.tag import pos_tag
df = pd.DataFrame({'pos': ['noun', 'Alice', 'good', 'well', 'city']})
s = df['pos']
tagged_sent = pos_tag(s.str.split())
But I get this traceback:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "../lib/python2.7/site-packages/nltk/tag/__init__.py", line 111, in pos_tag
    return _pos_tag(tokens, tagset, tagger)
  File "../lib/python2.7/site-packages/nltk/tag/__init__.py", line 82, in _pos_tag
    tagged_tokens = tagger.tag(tokens)
  File "/Users/mjpieters/Development/venvs/Whosebug-2.7/lib/python2.7/site-packages/nltk/tag/perceptron.py", line 152, in tag
    context = self.START + [self.normalize(w) for w in tokens] + self.END
  File "../lib/python2.7/site-packages/nltk/tag/perceptron.py", line 224, in normalize
    elif word.isdigit() and len(word) == 4:
AttributeError: 'list' object has no attribute 'isdigit'
What is going wrong?
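The expression s.str.split() returns a Series in which every element is a list of strings, so pos_tag receives lists where it expects individual string tokens. The tagger calls isdigit() on each token, and isdigit is a method of str, not of list, hence the AttributeError.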
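The failure can be reproduced without NLTK or pandas. The sketch below is only an illustration: `normalize` is a simplified, hypothetical stand-in for the tagger's per-token step (modeled on the `word.isdigit()` line in the traceback, not copied from NLTK), and the list comprehension imitates what `s.str.split()` produces for each row.

```python
def normalize(word):
    """Hypothetical, simplified stand-in for the tagger's normalization step."""
    if word.isdigit() and len(word) == 4:
        return "!YEAR"
    return word.lower()


words = ["noun", "Alice", "good", "well", "city"]

# Flat tokens: each item is a str, which is what a tagger iterates over.
flat_tokens = words
normalized = [normalize(token) for token in flat_tokens]

# Nested tokens: each item is a list, like the elements of s.str.split().
nested_tokens = [word.split() for word in words]

try:
    [normalize(token) for token in nested_tokens]
    error_message = None
except AttributeError as exc:
    # The list has no isdigit() method, so the call fails on the first item.
    error_message = str(exc)

print(normalized)
print(error_message)
```

Each element of `nested_tokens` is a one-item list such as `['noun']`, so the string method call fails on the very first element.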
Since each row already holds a single word, you can pass the Series object directly to pos_tag():
s = df['pos']
tagged_sent = pos_tag(s) # or pos_tag(s.tolist())
print(tagged_sent)
Output:
[('noun', 'JJ'), ('Alice', 'NNP'), ('good', 'JJ'), ('well', 'RB'), ('city', 'NN')]
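If the column held several words per row (which may be why split() was tried in the first place), the token lists would need to be flattened or tagged row by row. A minimal sketch, assuming hypothetical multi-word rows that are not part of the original question, and using only the standard library for the flattening step:

```python
from itertools import chain

# Hypothetical multi-word cells; the original column has one word per row.
rows = ["Alice is good", "the city sleeps well"]

# One list of tokens per row, analogous to what s.str.split() yields.
token_lists = [row.split() for row in rows]

# Flatten into a single sequence of str tokens.
flat_tokens = list(chain.from_iterable(token_lists))
print(flat_tokens)

# With NLTK installed, either form gives pos_tag the str tokens it expects:
#   pos_tag(flat_tokens)                           # everything as one sequence
#   [pos_tag(tokens) for tokens in token_lists]    # one tagged list per row
```

Tagging row by row keeps each row's words together as context, while flattening treats the whole column as one sequence; which is preferable depends on the data.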