Exclude certain matches using NSRegularExpression

I am following http://www.raywenderlich.com/86205/nsregularexpression-swift-tutorial and using the playground file at:

http://cdn5.raywenderlich.com/wp-content/uploads/2015/01/iRegex-Playground-Xcode-6.3.zip

to help find matches, but I need to be able to exclude certain results.

Basically, I am looking at the following pattern:

let thenotClasses = "*121:32,  Malachi 22:66 , 32:434, 16:111 , 17:11 , John 13:14, Verse 41:29, Great 71:21"

listMatches("\\d\\d?\\d?:\\d\\d?\\d?", inString: thenotClasses)

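I get all the number:number matches. However, what I really want is to also tell it to exclude any match that is prefixed with "*" or that starts with the word "Malachi " or "John ", but include the rest.

So in this case, I would like the match to return: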

[32:434, 16:111 , 17:11 , 41:29  and 71:21]

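Any help would be greatly appreciated, God bless :)

A regex pattern that invalidates a match when it is preceded by certain words is hard to write, mainly because the regex engine retries at every position, so it can simply start the match from the next digit.

If you use a negative lookbehind: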

(?<!\*|Malachi |John )(\d+:\d+)

which means "match digits not preceded by *, Malachi or John", the match will simply start from the next digit. For example, in Malachi 22:66 it will capture 2:66.
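To see this pitfall concretely, here is a minimal sketch that runs the lookbehind pattern against the sample string from the question. The names `text`, `lookbehind` and `partialMatches` are illustrative only, and the snippet assumes a current Swift toolchain with Foundation available.

```swift
import Foundation

let text = "*121:32,  Malachi 22:66 , 32:434, 16:111 , 17:11 , John 13:14, Verse 41:29, Great 71:21"

// The lookbehind-only pattern; backslashes are doubled for the Swift string literal.
let lookbehind = try! NSRegularExpression(pattern: "(?<!\\*|Malachi |John )(\\d+:\\d+)", options: [])

// NSRegularExpression works with NSRange, so convert the whole string's range.
let fullRange = NSRange(text.startIndex..., in: text)
let partialMatches = lookbehind.matches(in: text, options: [], range: fullRange).compactMap { match in
    // Convert each NSRange back to a Swift range and extract the matched text.
    Range(match.range, in: text).map { String(text[$0]) }
}

print(partialMatches)
// ["21:32", "2:66", "32:434", "16:111", "17:11", "3:14", "41:29", "71:21"]
```

The three entries that should have been excluded are still there, just with their first digit dropped: 21:32, 2:66 and 3:14.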

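The most common pitfall I have seen with regular expressions is delegating everything to the regex engine. It is indeed powerful, but you forget that you also have a far more flexible programming language calling the regex.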

Here is an idea that mixes the two together: capture any number:number and check what comes before it. Exclude the matches that are preceded by *, Malachi or John.

Pattern:

(\*|Malachi |John )?(\d+:\d+)

(\*|Malachi |John ) - match a *, Malachi or John and put it into capture group 1
?                   - make the first capture group optional
(\d+:\d+)           - match the verse and put it into capture group 2

Code:

import Foundation

let str = "*121:32,  Malachi 22:66 , 32:434, 16:111 , 17:11 , John 13:14, Verse 41:29, Great 71:21"
let s = str as NSString  // NSString is easier to work with, since the regex reports NSRange values

// Backslashes must be doubled inside a Swift string literal
let regex = try! NSRegularExpression(pattern: "(\\*|Malachi |John )?(\\d+:\\d+)", options: [])
var verses = [String]()

// Use s.length (UTF-16 code units), which is the unit NSRange is measured in
regex.enumerateMatches(in: str, options: [], range: NSRange(location: 0, length: s.length)) { result, flags, stop in
    // Check that the first capture group is not found. Otherwise, return
    guard let result = result, result.range(at: 1).location == NSNotFound else {
        return
    }

    // When the first capture group is not found, add the second capture group,
    // i.e. the verse number, to the result list
    verses.append(s.substring(with: result.range(at: 2)))
}

print(verses)  // ["32:434", "16:111", "17:11", "41:29", "71:21"]
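If the list of excluded prefixes needs to change, the same idea can be wrapped in a function. The following is only a sketch: `filteredVerses(in:excludingPrefixes:)` is a hypothetical helper name, not part of the tutorial or of Foundation, and it assumes every prefix is a non-empty string. It uses `NSRegularExpression.escapedPattern(for:)` so that characters like * are matched literally.

```swift
import Foundation

/// Hypothetical helper: returns every number:number match that is not
/// directly preceded by one of `prefixes` (assumed to be non-empty strings).
func filteredVerses(in text: String, excludingPrefixes prefixes: [String]) -> [String] {
    // Escape each prefix so characters such as "*" are treated literally.
    let alternatives = prefixes
        .map { NSRegularExpression.escapedPattern(for: $0) }
        .joined(separator: "|")
    let pattern = "(\(alternatives))?(\\d+:\\d+)"
    guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
        return []
    }

    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, options: [], range: fullRange).compactMap { match -> String? in
        // Keep the verse only when the optional prefix group did not participate.
        guard match.range(at: 1).location == NSNotFound,
              let verseRange = Range(match.range(at: 2), in: text) else {
            return nil
        }
        return String(text[verseRange])
    }
}

let sample = "*121:32,  Malachi 22:66 , 32:434, 16:111 , 17:11 , John 13:14, Verse 41:29, Great 71:21"
let kept = filteredVerses(in: sample, excludingPrefixes: ["*", "Malachi ", "John "])
print(kept)
// ["32:434", "16:111", "17:11", "41:29", "71:21"]
```

Converting ranges with `NSRange(_:in:)` and `Range(_:in:)` avoids the cast to NSString, at the cost of one optional unwrap per match.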