NLTK's spell checker is not working correctly

I want to use NLTK to check the spelling of a sentence in Python. The built-in spell checker is not working correctly: it reports 'with' and 'and' as misspelled.

import nltk
from nltk.corpus import wordnet as WN

def tokens(sent):
    return nltk.word_tokenize(sent)

def SpellChecker(line):
    for i in tokens(line):
        strip = i.rstrip()
        # A token with no WordNet synsets is reported as misspelled
        if not WN.synsets(strip):
            print("Wrong spellings : " + i)
        else:
            print("No mistakes :" + i)

def removePunct(text):
    return "".join(c for c in text if c not in ('!', '.', ':', ','))

l = "Attempting artiness With black & white and clever camera angles, the movie disappointed - became even more ridiculous - as the acting was poor and the plot and lines almost non-existent. "
noPunct = removePunct(l.lower())
# SpellChecker prints its results and returns None, so call it directly
SpellChecker(noPunct)

Can anyone tell me the reason?

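Those words are reported as misspelled because stopwords such as 'with' and 'and' are not included in WordNet (check the FAQs). WordNet only covers nouns, verbs, adjectives and adverbs, so function words have no synsets and WN.synsets() returns an empty list for them.

So you can use the stopwords list from the NLTK corpus to check such words instead.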

# Add these lines:
import nltk
from nltk.corpus import wordnet as WN
from nltk.corpus import stopwords
stop_words_en = set(stopwords.words('english'))

def tokens(sent):
    return nltk.word_tokenize(sent)

def SpellChecker(line):
    for i in tokens(line):
        strip = i.rstrip()
        if not WN.synsets(strip):
            if strip in stop_words_en:    # <--- Check whether it's in stopword list
                print("No mistakes :" + i)
            else:
                print("Wrong spellings : " + i)
        else:
            print("No mistakes :" + i)


def removePunct(text):
    return "".join(c for c in text if c not in ('!', '.', ':', ','))

l = "Attempting artiness With black & white and clever camera angles, the movie disappointed - became even more ridiculous - as the acting was poor and the plot and lines almost non-existent. "

noPunct = removePunct(l.lower())
# SpellChecker prints its results and returns None, so call it directly
SpellChecker(noPunct)
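
If you would rather collect the suspect words than print a line per token, the same check can be written as a function that returns a list. This is only a rough sketch, not part of NLTK: find_misspellings and nltk_misspellings are names made up for illustration, and the lookups are passed in as arguments so the logic can be tried without any corpus installed. It also skips tokens made up only of punctuation (for example "&" and "-"), which the version above would flag.

```python
import string


def find_misspellings(tokens, in_wordnet, stop_words):
    """Return the tokens that are neither stop words nor known to WordNet.

    in_wordnet: any callable that takes a word and returns True when the
                word has at least one synset (hypothetical parameter).
    stop_words: a set of lowercase stop words.
    """
    suspects = []
    for token in tokens:
        word = token.strip().lower()
        # Skip empty strings and pure punctuation such as "&" or "-"
        if not word or all(ch in string.punctuation for ch in word):
            continue
        # A word is accepted if either source knows it
        if word in stop_words or in_wordnet(word):
            continue
        suspects.append(token)
    return suspects


def nltk_misspellings(sentence):
    """Wire the helper to NLTK.

    Assumes the tokenizer, WordNet and stopwords resources have already
    been fetched with nltk.download().
    """
    import nltk
    from nltk.corpus import stopwords, wordnet

    stop_words = set(stopwords.words("english"))
    return find_misspellings(
        nltk.word_tokenize(sentence),
        lambda w: bool(wordnet.synsets(w)),
        stop_words,
    )
```

With NLTK and its data installed, nltk_misspellings(l) should return the flagged tokens for the sentence above. Keep in mind that WordNet is a lexical database rather than a spelling dictionary, so a few correctly spelled words may still end up in the list.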