Remove Certain Characters from a String using UDF

I have a column of cells that each contain a list of alphanumeric system codes like this:

4A(4,5,6,7,8,9); 4B(4,5,7,8); 3A(1,2,3); 3B(1,2,3), 3C(1,2)

In the cell next to it, I use a UDF to strip the special characters "(),;" so that the data is left as:

4A456789 4B4578 3A123 3B123 3C12

Function RemoveSpecial(Str As String) As String
    Dim SpecialChars As String
    Dim i As Long
    SpecialChars = "(),;-abcdefghijklmnopqrstuvwxyz"
    For i = 1 To Len(SpecialChars)
        Str = Replace$(Str, Mid$(SpecialChars, i, 1), "")
    Next
    RemoveSpecial = Str
End Function

This works well in most cases. In some cases, however, a cell contains an irregular pattern, for example when a space sits between 4A and the parenthesized items:

4A (4,5,6,7,8,9);

or when text appears inside the parentheses (with two spaces on each side):

4A (4,5,  skip  8,9);

or when a space appears between the first two characters:

4 A(4,5,6)

How would you solve this so that stray spaces are removed, except where they separate the actual data groups?

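One strategy is to swap the patterns you want to keep for placeholders before removing the "special" characters, and then restore the desired patterns afterwards.

From your sample data, it looks like you want to keep a space only if it follows ); or ),

Like this: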

Function RemoveSpecial(Data As Variant) As Variant
    Dim SpecialChars As String
    Dim KeepStr As Variant, PlaceHolder As Variant, ReplaceStr As Variant
    Dim i As Long
    Dim DataStr As String
    
    SpecialChars = " (),;-abcdefghijklmnopqrstuvwxyz"
    KeepStr = Array("); ", "), ")
    PlaceHolder = Array("~0~", "~1~") ' choose a PlaceHolder that won't appear in the data
    ReplaceStr = Array(" ", " ")
    DataStr = Data
    For i = LBound(KeepStr) To UBound(KeepStr)
        DataStr = Replace$(DataStr, KeepStr(i), PlaceHolder(i))
    Next
    For i = 1 To Len(SpecialChars)
        DataStr = Replace$(DataStr, Mid$(SpecialChars, i, 1), vbNullString)
    Next
    For i = LBound(KeepStr) To UBound(KeepStr)
        DataStr = Replace$(DataStr, PlaceHolder(i), ReplaceStr(i))
    Next
    RemoveSpecial = Application.Trim(DataStr)
End Function
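To make the three stages easier to follow, here is a minimal sketch that prints the string after each stage. The function name TracePlaceholderStages is hypothetical and not part of the answer above. It assumes Excel as the host (for Application.Trim) and that "~0~" and "~1~" never occur in the data.

```vba
' Hypothetical helper (not part of the answer's code): prints the string after
' each stage of the placeholder strategy and returns the final result.
Function TracePlaceholderStages(ByVal s As String) As String
    Const SpecialChars As String = " (),;-abcdefghijklmnopqrstuvwxyz"
    Dim i As Long

    ' Stage 1: protect the separators whose trailing space must survive.
    s = Replace$(s, "); ", "~0~")
    s = Replace$(s, "), ", "~1~")
    Debug.Print "protected: "; s

    ' Stage 2: strip every special character.
    ' Binary compare is passed explicitly so uppercase A-Z is never removed.
    For i = 1 To Len(SpecialChars)
        s = Replace$(s, Mid$(SpecialChars, i, 1), vbNullString, , , vbBinaryCompare)
    Next i
    Debug.Print "stripped:  "; s

    ' Stage 3: turn each placeholder back into a single space.
    s = Replace$(s, "~0~", " ")
    s = Replace$(s, "~1~", " ")
    Debug.Print "restored:  "; s

    ' Assumes Excel: Application.Trim also collapses repeated spaces.
    TracePlaceholderStages = Application.Trim(s)
End Function
```

Calling it from the Immediate window with "4 A (4,5,  skip  8,9); 4B(4,5,7,8), 3C(1,2)" should return 4A4589 4B4578 3C12. The stray spaces and the word "skip" are removed in stage 2, while the two protected separators come back as single spaces in stage 3.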

Another strategy is a regular expression (RegEx)

A regular expression looks like it could come in handy here, for example:

Function RemoveSpecial(Str As String) As String
    
    With CreateObject("vbscript.regexp")
        .Global = True
        .Pattern = "\)[;,]( )|[^A-Z\d]+"
        RemoveSpecial = .Replace(Str, "$1") ' $1 puts back the space captured after ); or ),
    End With

End Function

I used the regular expression:

\)[;,]( )|[^A-Z\d]+

The way this works is to apply a form of what some would call "the best regex trick ever": match everything you want to discard, and capture only the part you want to keep.

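  • \)[;,]( ) - an escaped closing parenthesis, followed by a comma or semicolon, followed by a space character that is captured in the first capture group.
  • | - or, as the alternative:
  • [^A-Z\d]+ - one or more characters that are not in the given character class (anything other than an uppercase letter or a digit).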
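The breakdown above can be checked against the irregular inputs from the question with a short sketch. The names RegexClean and DemoRegexClean are hypothetical. The "$1" replacement string is an assumption that follows from the capture group: the kept space is written back, and every other match is replaced with nothing.

```vba
' Hypothetical wrapper around the pattern above; the "$1" replacement is an assumption.
Function RegexClean(ByVal s As String) As String
    With CreateObject("VBScript.RegExp")
        .Global = True                      ' replace every match, not just the first
        .Pattern = "\)[;,]( )|[^A-Z\d]+"
        ' Group 1 holds the space after ); or ),
        ' For matches of the second alternative it is empty.
        RegexClean = .Replace(s, "$1")
    End With
End Function

Sub DemoRegexClean()
    Debug.Print RegexClean("4A(4,5,6,7,8,9); 4B(4,5,7,8)")      ' 4A456789 4B4578
    Debug.Print RegexClean("4A (4,5,  skip  8,9); 3A(1,2,3)")   ' 4A4589 3A123
    Debug.Print RegexClean("4 A(4,5,6), 3C(1,2)")               ' 4A456 3C12
End Sub
```

One limitation of this sketch: if lowercase text sits directly before the closing parenthesis, as in "skip); ", the second alternative starts matching first and consumes the separator together with its space.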


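Edit:

If you also have values such as 4A; or 4A, (a separator that directly follows a letter rather than a closing parenthesis), you can use: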

(?:([A-Z])|\))[;,]( )|[^A-Z\d]+

and replace with $1$2, so that group 1 restores the letter and group 2 restores the space.
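A minimal sketch of the extended pattern as a UDF follows. The name RegexCleanEx is hypothetical. The "$1$2" replacement string is an assumption inferred from the two capture groups: the letter consumed by group 1 has to be written back, followed by the space from group 2.

```vba
' Hypothetical variant for values where ; or , directly follows a letter.
Function RegexCleanEx(ByVal s As String) As String
    With CreateObject("VBScript.RegExp")
        .Global = True
        ' Group 1: an uppercase letter before the separator (empty when a ")" matched instead).
        ' Group 2: the space after the separator.
        .Pattern = "(?:([A-Z])|\))[;,]( )|[^A-Z\d]+"
        RegexCleanEx = .Replace(s, "$1$2")
    End With
End Function
```

For example, RegexCleanEx("4A; 4B(1,2), 3C(1,2)") should return 4A 4B12 3C12. The letter A is matched and then restored by $1, and both separators collapse to a single space.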