Swift String Tokenizer / Parser

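Hello Swift developers!

I'm a junior developer and, as an exercise, I'm trying to figure out the best way to tokenize/parse a Swift string.

I have a string that looks like this: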

let string = "This is a {B}string{/B} and this is a substring."

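What I want to do is tokenize the string and change the "strings/tokens" inside the tags you see.

I can see using NSRegularExpression and its matches, but it feels too generic. I only want to have, say, 2 of these tags that change the text. What is the best way to do this in Swift 5.2+?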

        if let regex = try? NSRegularExpression(pattern: "\\{[a-z0-9]+\\}", options: .caseInsensitive) {
            let string = self as NSString
            return regex.matches(in: self, options: [], range: NSRange(location: 0, length: string.length)).map {
                // now $0 is the result? but it won't work for enclosing the tags :/
            }
        }

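Not sure whether you have already solved it with NLTokenizer, but you can certainly solve it with a regex. Here is how. (I have implemented it generically, so if you later need to handle different kinds of tags and substitute a different string for each, a little tweaking of the logic should do the job.)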

    override func viewDidLoad() {
        super.viewDidLoad()

        // Backslashes must be doubled inside a regular Swift string literal.
        let regexStr = "(\\{B\\}(\\s*\\w+\\s*)*\\{\\/B\\})"
        let regex = try! NSRegularExpression(pattern: regexStr)
        let string = "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}"
        var foundRanges = [NSRange]()

        // NSRange counts UTF-16 code units, so use utf16.count rather than count.
        regex.enumerateMatches(in: string, options: [], range: NSMakeRange(0, string.utf16.count)) { (match, flag, stop) in
            if let matchRange = match?.range(at: 1) {
                foundRanges.append(matchRange)
            }
        }

        let substituteString = "abcd"
        var replacedString = string as NSString
        let foundRangesCount = foundRanges.count
        var currentRange = 0

        // Replace each match in order, shifting the stored ranges after every replacement.
        while foundRangesCount > currentRange {
            let range = foundRanges[currentRange]
            replacedString = replacedString.replacingCharacters(in: range, with: substituteString) as NSString
            reEvaluateAllRanges(ranges: &foundRanges, byOffset: range.length - substituteString.utf16.count)
            currentRange += 1
        }

        debugPrint(replacedString)
    }

    // Moves every stored range left by `byOffset` UTF-16 units (right if negative).
    func reEvaluateAllRanges(ranges: inout [NSRange], byOffset: Int) {
        var newFoundRange = [NSRange]()
        for range in ranges {
            newFoundRange.append(NSMakeRange(range.location - byOffset, range.length))
        }
        ranges = newFoundRange
    }

Input: "Sandeep {B}Bhandaari{/B} is here"
Output: Sandeep abcd is here

Input: "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}"
Output: Sandeep abcd is hereabcd

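Edge cases handled: a longer string being replaced by a shorter substitute string (and vice versa), and strings enclosed in the tags with or without spaces.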

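EDIT 1: The regex (\{B\}(\s*\w+\s*)*\{\/B\}) should be fairly self-explanatory: it matches a literal {B}, then zero or more words (each optionally surrounded by whitespace), then a literal {/B}. If you need help reading it, use a regex cheat sheet.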
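As a quick way to see what that pattern actually matches, here is a minimal sketch that collects every matched substring. The helper name `taggedSubstrings(in:)` is hypothetical and is not part of the answer's code.

```swift
import Foundation

// Hypothetical helper: returns every substring matched by the {B}...{/B} pattern.
func taggedSubstrings(in text: String) -> [String] {
    // \{B\}         literal opening tag
    // (\s*\w+\s*)*  zero or more words, each optionally surrounded by whitespace
    // \{\/B\}       literal closing tag
    let pattern = "(\\{B\\}(\\s*\\w+\\s*)*\\{\\/B\\})"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let nsText = text as NSString
    let fullRange = NSRange(location: 0, length: nsText.length)
    return regex.matches(in: text, options: [], range: fullRange).map {
        nsText.substring(with: $0.range(at: 1))
    }
}

print(taggedSubstrings(in: "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}"))
// prints ["{B}Bhandaari{/B}", "{B}Sandeep{/B}"]
```

Because `\w` only covers letters, digits and underscores, tagged text that contains punctuation would not be matched by this pattern.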

        // NSRange counts UTF-16 code units, so use utf16.count rather than count.
        regex.enumerateMatches(in: string, options: [], range: NSMakeRange(0, string.utf16.count)) { (match, flag, stop) in
            if let matchRange = match?.range(at: 1) {
                foundRanges.append(matchRange)
            }
        }

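I could have modified the string right here, but if you have multiple matches and you mutate the string, the ranges that were already evaluated become invalid. So I save all the found ranges into an array and apply the replacement to each range afterwards.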

        let substituteString = "abcd"
        var replacedString = string as NSString
        let foundRangesCount = foundRanges.count
        var currentRange = 0

        while foundRangesCount > currentRange {
            let range = foundRanges[currentRange]
            replacedString = replacedString.replacingCharacters(in: range, with: substituteString) as NSString
            reEvaluateAllRanges(ranges: &foundRanges, byOffset: range.length - substituteString.utf16.count)
            currentRange += 1
        }

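Here we iterate over all the matched ranges and replace the characters in each range with the substitute string. You can always use a switch / if-else ladder inside the while loop to detect different kinds of tags and pass a different substitute string for each one.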
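One possible sketch of that idea follows. It departs from the code above in two ways, both assumptions of this example: the pattern captures the tag name in a group instead of hard-coding {B}, and the result is built front to back so no stored ranges need shifting. The function names, the {I} tag, and the substitutions are all made up for illustration.

```swift
import Foundation

// Hypothetical per-tag rule: the tag names and substitutions are illustrative only.
func substitute(forTag tag: String, content: String) -> String {
    switch tag {
    case "B": return content.uppercased()
    case "I": return "_" + content + "_"
    default: return content // unknown tag: keep the text, drop the tags
    }
}

func replaceTags(in text: String) -> String {
    // Group 1 = tag name, group 2 = enclosed text; \1 demands the matching closing tag.
    let pattern = "\\{(\\w+)\\}(.*?)\\{/\\1\\}"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return text }
    let source = text as NSString
    let fullRange = NSRange(location: 0, length: source.length)
    var output = ""
    var cursor = 0 // UTF-16 offset of the first character not yet copied
    for match in regex.matches(in: text, options: [], range: fullRange) {
        // Copy the untouched text between the previous match and this one.
        output += source.substring(with: NSRange(location: cursor, length: match.range.location - cursor))
        let tag = source.substring(with: match.range(at: 1))
        let content = source.substring(with: match.range(at: 2))
        output += substitute(forTag: tag, content: content)
        cursor = match.range.location + match.range.length
    }
    // Copy whatever follows the last match.
    output += source.substring(from: cursor)
    return output
}

print(replaceTags(in: "This is a {B}string{/B} and {I}this{/I} is a substring."))
// prints This is a STRING and _this_ is a substring.
```

This sketch does not handle nested tags or tagged text that spans several lines.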

    func reEvaluateAllRanges(ranges: inout [NSRange], byOffset: Int) {
        var newFoundRange = [NSRange]()
        for range in ranges {
            newFoundRange.append(NSMakeRange(range.location - byOffset, range.length))
        }
        ranges = newFoundRange
    }

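This function shifts every range in the array by the offset. Remember that you only need to change each range's location; its length stays the same.

One small optimization you could make is to remove from the array the ranges to which you have already applied the substitute string.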
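A related simplification, offered as an alternative sketch rather than as the approach above: if the matches are replaced from last to first, the locations of the earlier matches never change, so nothing needs re-evaluating. The helper name `replaceBoldSpans(in:with:)` is hypothetical.

```swift
import Foundation

// Hypothetical helper: replaces every {B}...{/B} span with `substitute`.
func replaceBoldSpans(in text: String, with substitute: String) -> String {
    let pattern = "\\{B\\}(\\s*\\w+\\s*)*\\{\\/B\\}"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return text }
    let result = NSMutableString(string: text)
    let fullRange = NSRange(location: 0, length: result.length)
    let matches = regex.matches(in: text, options: [], range: fullRange)
    // Replacing the last match first leaves the locations of earlier matches untouched.
    for match in matches.reversed() {
        result.replaceCharacters(in: match.range, with: substitute)
    }
    return result as String
}

print(replaceBoldSpans(in: "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}", with: "abcd"))
// prints Sandeep abcd is hereabcd
```

When every match gets the same replacement, NSRegularExpression's `stringByReplacingMatches(in:options:range:withTemplate:)` may be enough on its own; the loop is mainly useful once the replacement depends on the match.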

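If using HTML tags instead of {B}{/B} is an acceptable option, you can use the StringEx library, which I wrote for this purpose.

You can select a substring inside an HTML tag and replace it with another string, like this: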

let string = "This is a <b>string</b> and this is a substring."
let ex = string.ex

ex[.tag("b")].replace(with: "some value")

print(ex.rawString) // This is a <b>some value</b> and this is a substring.
print(ex.string) // This is a some value and this is a substring.

If necessary, you can also style the selected substrings and get an NSAttributedString:

ex[.tag("b")].style([
    .font(.boldSystemFont(ofSize: 16)),
    .color(.black)
])

myLabel.attributedText = ex.attributedString