Is there any way in Python to auto-correct spelling mistakes across multiple rows of a single column in an Excel file?

Question

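I am doing sentiment analysis for a university project. I have an Excel file with a column named "comments" that has 1000 rows. The sentences in these rows contain spelling mistakes, and I need to correct them before the analysis. I don't know how to approach this so that, using Python code, I end up with a column of corrected sentences.

All the methods I have found correct spelling mistakes in individual words, not in sentences, and not at the level of a column with hundreds of rows.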

Answer 1

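You can use a spell checker for this (SpellChecker below comes from the pyspellchecker package). Split each sentence into words, correct each word, and join the words back together: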

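import pandas as pd
from spellchecker import SpellChecker  # provided by the pyspellchecker package

spell = SpellChecker()

# Sample data; for the real file, load the column with pd.read_excel instead
df = pd.DataFrame(['hooww good mrning playing fotball studyiing hard'], columns=['text'])

def spell_check(sentence):
    """Correct a sentence word by word and join the result back together."""
    corrected_words = []
    for word in sentence.split():
        # In recent pyspellchecker versions correction() can return None
        # when it finds no candidate; keep the original word in that case
        corrected_words.append(spell.correction(word) or word)
    return ' '.join(corrected_words)


# Apply the correction to every row of the column
df['spell_corrected_sentence'] = df['text'].apply(spell_check)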
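
The snippet above works on a one-row sample DataFrame. To run the same idea against the actual Excel file, something along these lines might work. This is only a sketch: the helper names correct_sentence and correct_excel_column and the file names in the usage note are hypothetical, and it assumes an Excel engine such as openpyxl is installed so that pandas can read and write .xlsx files.

```python
def correct_sentence(sentence, correct_word):
    """Apply a word-level corrector to each whitespace-separated word."""
    # Empty Excel cells arrive as NaN (a float), so leave non-strings untouched
    if not isinstance(sentence, str):
        return sentence
    # Fall back to the original word when the corrector returns None
    return " ".join(correct_word(word) or word for word in sentence.split())


def correct_excel_column(in_path, out_path, column, correct_word):
    """Read an Excel file, add a '<column>_corrected' column, and save it."""
    import pandas as pd  # imported here so correct_sentence has no dependencies

    df = pd.read_excel(in_path)
    df[column + "_corrected"] = df[column].apply(
        lambda sentence: correct_sentence(sentence, correct_word)
    )
    df.to_excel(out_path, index=False)
    return df
```

With pyspellchecker, a call could look like correct_excel_column("comments.xlsx", "comments_corrected.xlsx", "comments", SpellChecker().correction), where both file names are placeholders for your own paths. The original column is left in place so the corrected sentences sit next to it for comparison.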

python

nlp

spell-checking