Remove Certain Characters from a String using UDF
I have a column whose cells contain lists of alphanumeric systems like this:
4A(4,5,6,7,8,9); 4B(4,5,7,8); 3A(1,2,3); 3B(1,2,3), 3C(1,2)
In the cell next to it, I use a UDF to strip the special characters "(),;" so that the data is left as:
4A456789 4B4578 3A123 3B123 3C12
Function RemoveSpecial(Str As String) As String
    Dim SpecialChars As String
    Dim i As Long
    ' Characters to strip: punctuation plus every lowercase letter
    SpecialChars = "(),;-abcdefghijklmnopqrstuvwxyz"
    For i = 1 To Len(SpecialChars)
        Str = Replace$(Str, Mid$(SpecialChars, i, 1), "")
    Next
    RemoveSpecial = Str
End Function
In most cases this works well. However, some cells contain unorthodox patterns, for example when a space sits between 4A and the parenthesized items:
4A (4,5,6,7,8,9);
or when there is text inside the parentheses (including two spaces on each side):
4A (4,5,  skip  8,9);
or when a space appears between the first two characters:
4 A(4,5,6)
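How would you solve this so that the stray spaces are removed, except where they separate the actual data groups?
One strategy is to replace the patterns you want to keep with placeholders before eliminating the "special" characters, and then restore the desired patterns afterwards.
From your sample data, it looks like you want to keep a space only when it follows ); or ), like this: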
Function RemoveSpecial(Data As Variant) As Variant
    Dim SpecialChars As String
    Dim KeepStr As Variant, PlaceHolder As Variant, ReplaceStr As Variant
    Dim i As Long
    Dim DataStr As String
    SpecialChars = " (),;-abcdefghijklmnopqrstuvwxyz"
    KeepStr = Array("); ", "), ")
    PlaceHolder = Array("~0~", "~1~") ' choose a PlaceHolder that won't appear in the data
    ReplaceStr = Array(" ", " ")
    DataStr = Data
    ' 1. Protect the separators we want to keep
    For i = LBound(KeepStr) To UBound(KeepStr)
        DataStr = Replace$(DataStr, KeepStr(i), PlaceHolder(i))
    Next
    ' 2. Strip every special character, including stray spaces
    For i = 1 To Len(SpecialChars)
        DataStr = Replace$(DataStr, Mid$(SpecialChars, i, 1), vbNullString)
    Next
    ' 3. Turn each placeholder back into a single space
    For i = LBound(KeepStr) To UBound(KeepStr)
        DataStr = Replace$(DataStr, PlaceHolder(i), ReplaceStr(i))
    Next
    RemoveSpecial = Application.Trim(DataStr)
End Function
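To check the function above against the samples from the question, a small self-check can be run from the VBA editor. This is only a sketch: the Sub name CheckRemoveSpecial is hypothetical, it assumes a RemoveSpecial function is present in the same standard module, and the expected strings are the outputs the question asks for.

```vba
Option Explicit

' Hypothetical self-check; assumes RemoveSpecial lives in the same module.
Sub CheckRemoveSpecial()
    Dim inputs As Variant, expected As Variant
    Dim actual As String
    Dim i As Long

    ' Sample cells taken from the question
    inputs = Array( _
        "4A(4,5,6,7,8,9); 4B(4,5,7,8); 3A(1,2,3); 3B(1,2,3), 3C(1,2)", _
        "4A (4,5,6,7,8,9);", _
        "4A (4,5,  skip  8,9);", _
        "4 A(4,5,6)")
    ' Desired results for each sample
    expected = Array( _
        "4A456789 4B4578 3A123 3B123 3C12", _
        "4A456789", _
        "4A4589", _
        "4A456")

    For i = LBound(inputs) To UBound(inputs)
        ' CStr passes a temporary, so this works for String or Variant parameters
        actual = RemoveSpecial(CStr(inputs(i)))
        Debug.Print actual, IIf(actual = expected(i), "OK", "MISMATCH")
    Next i
End Sub
```

The results appear in the Immediate window (Ctrl+G). On a worksheet the function is called like any other formula, for example =RemoveSpecial(A1).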
Another strategy is a regular expression (RegEx).
It seems a regular expression could come in handy here, for example:
Function RemoveSpecial(Str As String) As String
    With CreateObject("vbscript.regexp")
        .Global = True
        .Pattern = "\)[;,]( )|[^A-Z\d]+"
        ' "$1" puts back the captured separator space; every other match is dropped
        RemoveSpecial = .Replace(Str, "$1")
    End With
End Function
The regular expression I used:
\)[;,]( )|[^A-Z\d]+
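See an online demo to view the result in your browser. This works by applying a form of what some would call "The best regex trick ever!":
\)[;,]( )
- An escaped closing parenthesis, then a comma or semicolon, then a space character captured in the first capture group.
|
- Or, as the alternative:
[^A-Z\d]+
- One or more characters outside the given character class, that is, anything that is not an uppercase letter or a digit.
Replacing each match with $1 keeps the captured space and drops everything else.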
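The capture group only pays off if the replacement string refers to it. Below is a minimal sketch of that idea, assuming a replacement of "$1". The names RemoveSpecialRx and DemoRemoveSpecialRx are hypothetical, and the Static regex object is an optional choice so that the object is created once rather than for every cell.

```vba
Option Explicit

' Hypothetical name; same pattern as above, with "$1" as the replacement.
Function RemoveSpecialRx(ByVal Str As String) As String
    Static rx As Object ' late-bound VBScript.RegExp, created once and reused

    If rx Is Nothing Then
        Set rx = CreateObject("VBScript.RegExp")
        rx.Global = True
        rx.Pattern = "\)[;,]( )|[^A-Z\d]+"
    End If

    ' When the first alternative matched, $1 is the captured space.
    ' When the second alternative matched, $1 is empty, so the match is deleted.
    RemoveSpecialRx = rx.Replace(Str, "$1")
End Function

Sub DemoRemoveSpecialRx()
    ' Prints: 4A456789 4B4578 3A123 3B123 3C12
    Debug.Print RemoveSpecialRx("4A(4,5,6,7,8,9); 4B(4,5,7,8); 3A(1,2,3); 3B(1,2,3), 3C(1,2)")
    ' Prints: 4A4589
    Debug.Print RemoveSpecialRx("4A (4,5,  skip  8,9);")
    ' Prints: 4A456
    Debug.Print RemoveSpecialRx("4 A(4,5,6)")
End Sub
```

Because the first alternative is tried before the second at each position, a ")" followed by ";" or "," and a space is consumed as one match and collapses to a single space. Any other run of unwanted characters falls through to the second alternative and disappears.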
Edit:
If you have values such as 4A; or 4A, you could use:
(?:([A-Z])|\))[;,]( )|[^A-Z\d]+
and replace with both capture groups ($1$2). See an online demo.
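A sketch of that variant, assuming the replacement string is "$1$2": the letter before the separator has to be captured and written back, otherwise it would be consumed along with the ";" or ",". The function and Sub names here are hypothetical.

```vba
Option Explicit

' Hypothetical name; the "$1$2" replacement is an assumption.
Function RemoveSpecialKeepBare(ByVal Str As String) As String
    With CreateObject("VBScript.RegExp")
        .Global = True
        ' Group 1: an uppercase letter directly before the separator (if any)
        ' Group 2: the space that follows ";" or ","
        .Pattern = "(?:([A-Z])|\))[;,]( )|[^A-Z\d]+"
        ' An unmatched group expands to an empty string in the replacement
        RemoveSpecialKeepBare = .Replace(Str, "$1$2")
    End With
End Function

Sub DemoRemoveSpecialKeepBare()
    ' Prints: 4A 4B 3C12 3D
    Debug.Print RemoveSpecialKeepBare("4A; 4B, 3C(1,2); 3D")
End Sub
```

With the earlier pattern the same input would come out as 4A4B3C12 3D, because "; " after a bare 4A is matched only by the catch-all alternative and removed together with its space.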