Keep tracking the position of a word when looping over wx.TextCtrl.GetValue() to handle repeated words

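I'm trying to build a spell checker in a wxPython GUI, but I'm having trouble with repeated misspelled words. The function needs to color each misspelled word red and then propose a correction in a MessageDialog. The correction suggestions in the MessageDialog work fine, but I can't get the color change to work for repeated misspelled words. I know the problem is in how I get the start position of the word: it always picks up the first occurrence of the word and ignores the others.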

for i in range(self.tx_input.GetNumberOfLines()):
    line = self.tx_input.GetLineText(i)
    for word in text:
        if word in line and word not in suggestion:
            startPos = self.tx_input.GetValue().find(word)
            endPos = startPos + len(word)
            self.tx_input.SetStyle(startPos, endPos, wx.TextAttr("red", "white"))

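Because I'm looping over the lines, I would understand the problem if the misspelled word were repeated within the same line, but the strangest thing is that it also fails when the repetitions are on different lines.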
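The different-line case can be reproduced without wx. This is a minimal sketch that assumes a made-up two-line sample string in place of the control's contents; `sample` and `found` are illustrative names, not part of the code in this question.

```python
# Hypothetical sample text: "teh" is misspelled once on each line.
sample = "teh cat sat\non teh mat"

found = []
for line_no, line in enumerate(sample.split("\n")):
    if "teh" in line:
        # The search runs over the whole text, not over `line`,
        # so the same start position comes back on every iteration.
        found.append((line_no, sample.find("teh")))

print(found)  # [(0, 0), (1, 0)]
```

Both lines report position 0, so only the first occurrence would ever be styled.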

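I need help figuring out how to keep track of the positions that have already been processed, so that they are skipped when the word occurs again.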


Full code

import wx


class AddQuestion(wx.Frame):

    def __init__(self,parent):

        wx.Frame.__init__(self,parent,id=wx.ID_ANY, title='Grammar Checker', pos=wx.DefaultPosition, size=wx.Size(350,350), style=wx.DEFAULT_FRAME_STYLE)
        self.SetBackgroundColour( wx.Colour( 0, 93, 126 ))
        panel = wx.Panel(self)
        mainBox = wx.BoxSizer(wx.VERTICAL)
        panel.SetSizer(mainBox)

        self.tx_input = wx.TextCtrl(panel, wx.ID_ANY, wx.EmptyString, wx.DefaultPosition, wx.Size(350,150), wx.TE_MULTILINE|wx.TE_RICH2)
        mainBox.Add( self.tx_input, 1, wx.ALL|wx.EXPAND, 5 )

        btnBox = wx.BoxSizer(wx.VERTICAL)
        self.btn = wx.Button(panel, wx.ID_ANY,u'Grammar Check', wx.DefaultPosition, wx.DefaultSize,0)
        btnBox.Add(self.btn,0,wx.ALL,5)

        mainBox.Add(btnBox,0,wx.ALL|wx.ALIGN_CENTER,5)

        self.btn.Bind(wx.EVT_BUTTON, self.grammarCheck)

    def warn(self, parent, message, caption = 'WARNING!'):
        dlg = wx.MessageDialog(parent, message, caption, wx.OK | wx.ICON_WARNING)
        dlg.ShowModal()
        dlg.Destroy()

    def grammarCheck(self, event):

        from symspellpy import SymSpell

        sym_spell = SymSpell()
        sym_spell.load_dictionary("frequency_dictionary_en_82_765.txt", 0, 1)
        input_term = self.tx_input.GetValue().lower()
        # filter out unnecessary characters (punctuation and digits)
        ignore = r'!"#$%&\'()*+,-./:;<=>?@[\]^_`{|}~1234567890“”–'
        text = sym_spell.word_segmentation(input_term.translate(str.maketrans("","",ignore)), max_edit_distance=2 ).segmented_string.split()
        suggestion = sym_spell.word_segmentation(input_term.translate(str.maketrans("","",ignore)), max_edit_distance=2 ).corrected_string.split()

        for i in range(self.tx_input.GetNumberOfLines()):
            line = self.tx_input.GetLineText(i)
            for word in text:
                if word in line and word not in suggestion:
                    startPos = self.tx_input.GetValue().find(word)
                    endPos = startPos + len(word)
                    self.tx_input.SetStyle(startPos, endPos, wx.TextAttr("red", "white"))

        for i in range(len(text)):
            if text[i] != suggestion[i]:
                self.warn(self, "`" + text[i] + "`" + " did you mean " + "`" + suggestion[i] + "` ?")


if __name__ == '__main__':

    app = wx.App(False)
    frame = AddQuestion(None)
    frame.Show()

    app.MainLoop()

You keep using string.find(word), which only finds the first match. Searching for something like "python find all occurrences of sub-string", I found:

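# Hypothetical sample inputs, only so the snippets run as written
test_str = "teh cat saw teh dog"
test_sub = "teh"
# using list comprehension + startswith()
# All occurrences of substring in string
res = [i for i in range(len(test_str)) if test_str.startswith(test_sub, i)]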

import re

# using re.finditer()
# All occurrences of substring in string (re.escape makes test_sub match literally)
res = [i.start() for i in re.finditer(re.escape(test_sub), test_str)]

Both return a list of start positions in the test string, and both can be adapted to your code.
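One way the adaptation might look is sketched below. The helper names `find_word_spans` and `misspelled_spans` are made up for illustration. The sketch assumes that `text` and `suggestion` line up index by index (the second loop in the question already relies on this) and that whole-word, case-insensitive matching is what you want, since the question lowercases the input before segmenting it.

```python
import re


def find_word_spans(full_text, word):
    """Return (start, end) for every whole-word, case-insensitive match."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [m.span() for m in pattern.finditer(full_text)]


def misspelled_spans(full_text, words, suggestions):
    """Collect spans for each word whose paired suggestion differs from it."""
    # Assumption: words[i] corresponds to suggestions[i].
    misspelled = {w for w, s in zip(words, suggestions) if w != s}
    spans = []
    for word in misspelled:
        spans.extend(find_word_spans(full_text, word))
    return sorted(spans)


# Inside grammarCheck, this could replace the nested line/word loop:
#     value = self.tx_input.GetValue()
#     for start, end in misspelled_spans(value, text, suggestion):
#         self.tx_input.SetStyle(start, end, wx.TextAttr("red", "white"))

print(find_word_spans("Teh cat saw teh other cat", "teh"))  # [(0, 3), (12, 15)]
```

Searching the whole value once per misspelled word removes the need to loop over lines at all. One thing worth checking on your platform: offsets into GetValue() may not always match the control's own positions when the text contains line breaks, so test with multi-line input.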