How to use a stemming algorithm on a list of words in Python
I have a list of words:
'AWS',
'jQuery',
'jQuery',
'Sliding',
'jQuery',
'jQuery',
'Manipulating',
'Us!'
I have removed the common words and now need to apply stemming to clean up the list further.
My code for removing the common words:
raw2 = second_headers
CORPUS = Common_word_corpus  # my personal word corpus added here
corpus = [w.lower() for w in CORPUS]
processed_H2_tag = [w for w in raw2.split(' ') if w.lower() not in corpus]
print(processed_H2_tag)
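The filtering step above can be sketched as a self-contained snippet. The values given to `second_headers` and `Common_word_corpus` here are hypothetical stand-ins, since the question does not show their contents, and the helper name `remove_common_words` is made up for illustration. A set is used for the lookup, which is an optional change from the list in the original.

```python
# Hypothetical stand-ins for the variables the question leaves undefined.
second_headers = "Getting Started with AWS and jQuery Sliding Menus"
Common_word_corpus = ["Getting", "Started", "with", "and", "Menus"]


def remove_common_words(text, common_words):
    """Return the whitespace-separated tokens of `text` that are not common words.

    The comparison is case-insensitive; the surviving tokens keep their case.
    """
    common = {w.lower() for w in common_words}  # set gives fast membership tests
    return [w for w in text.split() if w.lower() not in common]


processed_H2_tag = remove_common_words(second_headers, Common_word_corpus)
print(processed_H2_tag)
```

Calling `split()` with no argument also drops the empty strings that `split(' ')` would produce when the text contains consecutive spaces.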
How about this? It uses NLTK's WordNet lemmatizer rather than a stemmer:
# download the WordNet data (only needed once)
import nltk
nltk.download('wordnet')
# import these modules
from nltk.stem import WordNetLemmatizer
from nltk.corpus import wordnet
lemmatizer = WordNetLemmatizer()
# choose some words to be lemmatized
words = ['AWS',
'jQuery',
'jQuery',
'Sliding',
'jQuery',
'jQuery',
'Manipulating',
'Manipulateing',
'Manipulate',
'Us!']
for w in words:
    print(w, " : ", lemmatizer.lemmatize(w.lower(), pos=wordnet.VERB))
Output:
AWS : aws
jQuery : jquery
jQuery : jquery
Sliding : slide
jQuery : jquery
jQuery : jquery
Manipulating : manipulate
Manipulateing : manipulate
Manipulate : manipulate
Us! : us!
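Since the question asks for stemming specifically, here is a minimal sketch using NLTK's `PorterStemmer`, which needs no corpus download. Stripping surrounding punctuation with `str.strip(string.punctuation)` is an added assumption about what "cleaner" means here, and the helper name `stem_words` is made up for illustration.

```python
import string

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

words = ['AWS', 'jQuery', 'Sliding', 'Manipulating', 'Us!']


def stem_words(tokens):
    """Strip leading/trailing punctuation from each token, then stem it.

    PorterStemmer.stem lowercases its input by default, so no explicit
    lower() call is needed. Tokens that are empty after stripping are dropped.
    """
    cleaned = (t.strip(string.punctuation) for t in tokens)
    return [stemmer.stem(t) for t in cleaned if t]


stems = stem_words(words)
for original, stem in zip(words, stems):
    print(original, ":", stem)
```

A stemmer applies suffix-stripping rules, so its results are not always dictionary words: `Manipulating` becomes `manipul`, whereas the lemmatizer above returns `manipulate`. Which one is preferable depends on whether the cleaned list needs to stay human-readable.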