VBA: Append unique regular expression matches to a string variable
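How can I get the regular expression matches from a string, remove the duplicates, and append them to a comma-separated string variable?
For example, in the string "this is an example of the desired regular expressions: BPOI-G8J7R9, BPOI-G8J7R9 and BPOI-E5Q8D2", the desired output string would be "BPOI-G8J7R9,BPOI-E5Q8D2".
I tried to remove the duplicates with a dictionary, but my function spits out the dreaded #VALUE! error.
Can anyone see where I'm going wrong, or suggest a better way of accomplishing this task?
Code below: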
Public Function extractexpressions(ByVal text As String) As String
    Dim regex, expressions, expressions_dict As Object, result As String, found_expressions As Variant, i As Long
    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = "[A-Z][A-Z][A-Z][A-Z][-]\w\w\w\w\w\w"
    regex.Global = True
    Set expressions_dict = CreateObject("Scripting.Dictionary")
    If regex.Test(text) Then
        expressions = regex.Execute(text)
    End If
    For Each item In expressions
        If Not expressions_dict.exists(item) Then expressions_dict.Add item, 1
    Next
    found_expressions = expressions_dict.items
    result = ""
    For i = 1 To expressions_dict.Count - 1
        result = result & found_expressions(i) & ","
    Next i
    extractexpressions = result
End Function
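If you call your function from a Sub, you'll be able to debug it.
See the comment in the code below about adding matches to the dictionary as keys: if you add the match object itself, rather than explicitly specifying the match's Value property, the dictionary will not de-duplicate your matches (because two or more match objects with the same Value are still different objects). Note too that regex.Execute returns an object, so its result has to be assigned with Set.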
Sub Tester()
    Debug.Print extractexpressions("ABCD-999999 and DFRG-123456 also ABCD-999999 blah")
End Sub
Public Function extractexpressions(ByVal text As String) As String
    Dim regex As Object, expressions As Object, expressions_dict As Object
    Dim item
    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = "[A-Z]{4}-\w{6}"
    regex.Global = True
    If regex.Test(text) Then
        Set expressions = regex.Execute(text)
        Set expressions_dict = CreateObject("Scripting.Dictionary")
        For Each item In expressions
            'A dictionary can have object-type keys, so make sure to add the match *value*
            ' and not the match object itself
            If Not expressions_dict.Exists(item.Value) Then expressions_dict.Add item.Value, 1
        Next
        'Keys (not Items) hold the unique match values
        extractexpressions = Join(expressions_dict.Keys, ",")
    End If
End Function
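To make the point about object keys concrete, here is a minimal sketch that fills two dictionaries from the same matches, one keyed by the match object and one keyed by its Value. The procedure name DemoDictionaryKeys and the variable names are made up for illustration and do not come from the answer above.

```vba
' Hypothetical demo: object keys vs. string keys in a Scripting.Dictionary
Sub DemoDictionaryKeys()
    Dim regex As Object, m As Object
    Dim byObject As Object, byValue As Object

    Set regex = CreateObject("VBScript.RegExp")
    regex.Pattern = "[A-Z]{4}-\w{6}"
    regex.Global = True

    Set byObject = CreateObject("Scripting.Dictionary")
    Set byValue = CreateObject("Scripting.Dictionary")

    For Each m In regex.Execute("ABCD-999999 and ABCD-999999")
        ' Keyed by the match object: each match is a distinct object
        If Not byObject.Exists(m) Then byObject.Add m, 1
        ' Keyed by the matched text: identical strings collapse to one key
        If Not byValue.Exists(m.Value) Then byValue.Add m.Value, 1
    Next m

    ' Expected in the Immediate window: 2 and 1
    Debug.Print byObject.Count, byValue.Count
End Sub
```

Run it from the VBA editor with the Immediate window open. The differing counts are why the function needs item.Value as the key.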
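VBA's regular expression object actually supports backreferences to previously captured groups, so the pattern itself can pick out the unique items:
([A-Z]{4}-\w{6})(?!.*\1)
The negative lookahead rejects a match when the same text occurs again later in the string, so only the last occurrence of each value is kept.
To put this into practice: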
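Sub Test()
    Debug.Print extractexpressions("this is an example of the desired regular expressions: BPOI-G8J7R9, BPOI-G8J7R9 and BPOI-E5Q8D2")
End Sub
Public Function extractexpressions(ByVal text As String) As String
    With CreateObject("VBScript.RegExp")
        'Keep a code only if it does not occur again later (\1); any other character matches "."
        .Pattern = "([A-Z]{4}-\w{6})(?!.*\1)|."
        .Global = True
        'Each match becomes "$1 ": the kept code plus a space, or just a space.
        'Application.Trim (Excel) collapses the runs of spaces before they are turned into commas
        extractexpressions = Replace(Application.Trim(.Replace(text, "$1 ")), " ", ",")
    End With
End Function
Prints:
BPOI-G8J7R9,BPOI-E5Q8D2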
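The backreference approach can also be used without Excel's Application.Trim, by collecting the matches directly. The following is only a sketch under that assumption. The function name ExtractUniqueMatches is hypothetical, and the pattern is the lookahead form with the \1 backreference.

```vba
' Hypothetical variant: collect the unique matches instead of replacing
Public Function ExtractUniqueMatches(ByVal text As String) As String
    Dim m As Object, result As String

    With CreateObject("VBScript.RegExp")
        ' Match a code only when the same code does not appear again later
        .Pattern = "([A-Z]{4}-\w{6})(?!.*\1)"
        .Global = True
        For Each m In .Execute(text)
            ' Add a separator before every value except the first
            If Len(result) > 0 Then result = result & ","
            result = result & m.Value
        Next m
    End With

    ExtractUniqueMatches = result
End Function
```

Because the lookahead keeps the last occurrence of each value, the order of the result can differ from a first-occurrence order. Since "." does not match a line break, duplicates that sit on different lines of a multi-line string would not be removed.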