Swift: How to identify and delete prepositions in a string
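I'm trying to identify the keywords to search for in a user's entry, so I want to filter out certain parts of speech and extract the keywords to query in my database.
Currently I use the code below to replace the word "of" in a string: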
let rawString = "I’m jealous of my parents. I’ll never have a kid as cool as theirs, one who is smart, has devilishly good looks, and knows all sorts of funny phrases."
var filtered = self.rawString.replacingOccurrences(of: "of", with: "")
What I want to do now is extend this to replace every preposition in the string.
My idea is to create a huge list of known prepositions, such as
let prepositions = ["in","through","after","under","beneath","before"......]
then split the string on whitespace with
var WordList : [String] = filtered.components(separatedBy: " ")
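and then iterate over the word list to find preposition matches and remove them. Building the list would be ugly, and it might not be efficient for my code.
What is the best way to detect and remove prepositions from a string?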
var newString = rawString
.split(separator: " ")
.filter { !prepositions.contains(String($0)) }
.joined(separator: " ")
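The question worries that a word list may be inefficient. As a rough sketch of the same split/filter/join idea, the version below stores the list in a Set so each lookup is constant time, and it ignores case and surrounding punctuation when comparing. The names prepositionSet and removingWords(in:from:) are hypothetical, and the list is deliberately short. Keep in mind that a plain word list cannot tell when a word such as "before" is used as something other than a preposition.

```swift
import Foundation

// Hypothetical, deliberately short list; extend it as needed.
let prepositionSet: Set<String> = ["of", "in", "with", "through", "after", "under", "beneath", "before"]

/// Returns `text` without the whitespace-separated words found in `stopWords`.
/// The comparison ignores case and leading/trailing punctuation.
func removingWords(in stopWords: Set<String>, from text: String) -> String {
    return text
        .split(whereSeparator: { $0.isWhitespace })
        .filter { word in
            // Normalize "Of," or "with." to "of" / "with" before the lookup.
            let key = word.trimmingCharacters(in: .punctuationCharacters).lowercased()
            return !stopWords.contains(key)
        }
        .joined(separator: " ")
}

let sample = "The ripe taste of cheese improves with age."
print(removingWords(in: prepositionSet, from: sample))
// prints "The ripe taste cheese improves age."
```

Words that are kept retain their punctuation, because only the lookup key is normalized.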
Using NaturalLanguage:
import NaturalLanguage
let text = "The ripe taste of cheese improves with age."
let tagger = NLTagger(tagSchemes: [.lexicalClass])
tagger.string = text
let options: NLTagger.Options = [.omitPunctuation, .omitWhitespace]
var newSentence = [String]()
tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .word, scheme: .lexicalClass, options: options) { tag, tokenRange in
guard let tag = tag, tag != .preposition else { return true }
newSentence.append("\(text[tokenRange])")
return true
}
print("Input: \(text)")
print("Output: \(newSentence.joined(separator: " "))")
This prints:
Input: The ripe taste of cheese improves with age.
Output: The ripe taste cheese improves age
Note that the two prepositions, of and with, were removed. My approach also strips punctuation; you can change that by adjusting the .omitPunctuation option.
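Since the question mentions filtering out several parts of speech, not only prepositions, the same tagger loop can be wrapped in a function that takes a set of lexical classes to drop. This is only a sketch: the function name keywords(in:excluding:) and the default set of excluded tags are assumptions, not part of the answer above, and the exact tags assigned depend on the system's language model.

```swift
import NaturalLanguage

/// Returns the words of `text` whose lexical class is not in `excluded`.
/// Like the loop above, tokens that receive no tag are skipped.
func keywords(in text: String,
              excluding excluded: Set<NLTag> = [.preposition, .determiner, .conjunction, .pronoun]) -> [String] {
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text
    var result = [String]()
    tagger.enumerateTags(in: text.startIndex..<text.endIndex,
                         unit: .word,
                         scheme: .lexicalClass,
                         options: [.omitPunctuation, .omitWhitespace]) { tag, tokenRange in
        // Skip untagged tokens and tokens whose tag is excluded.
        guard let tag = tag, !excluded.contains(tag) else { return true }
        result.append(String(text[tokenRange]))
        return true // continue enumerating
    }
    return result
}

print(keywords(in: "The ripe taste of cheese improves with age."))
```

Passing excluding: [.preposition] should reproduce the behaviour of the loop above; widening the set trades recall for a shorter keyword list.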