VBA: How to remove non-printable characters from data

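I need to programmatically remove non-printable characters such as:

tabs - char(9), line breaks - char(10), carriage return - char(13), data link escape - char(16)

I started a generic function that will be called from the lost_focus event of the MS Access form field.

I have not figured out how to identify when the string contains the unwanted characters.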

    Function RemoveNonPrintableCharacters(ByVal TextData) As String

        Dim dirtyString As String
        Dim cleanString As String
        Dim iPosition As Integer

        If IsNull(TextData) Then
            Exit Function
        End If

        dirtyString = TextData
        cleanString = ""

        For iPosition = 1 To Len(dirtyString)
            Select Case Asc(Mid(dirtyString, iPosition, 1))
                Case 9    ' Char(9)
                Case 10   ' Char(10)
                Case 13   ' Char(13)
                Case 16   ' Char(16)
                Case Else ' Add character to clean field.
                    cleanString = cleanString & Mid(dirtyString, iPosition, 1)
            End Select
        Next

        RemoveNonPrintableCharacters = cleanString

    End Function
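
As a rough sketch of how the function above might be wired to the form, assuming a text box named txtNotes (a hypothetical control name, not from the original post), the Lost Focus handler could look something like this. The Null check is there so that an empty field is not overwritten with a zero-length string.

```vba
' Hypothetical control name: txtNotes. Replace it with your own text box.
Private Sub txtNotes_LostFocus()
    ' Skip empty fields so a Null value is not overwritten with "".
    If Not IsNull(Me.txtNotes.Value) Then
        Me.txtNotes.Value = RemoveNonPrintableCharacters(Me.txtNotes.Value)
    End If
End Sub
```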

These are the 2 strings I have been using while testing:

This line,    has       multiple,     tabs       that   need to be removed


This line, has multiple,     
line
breaks
that
need to be removed

This line,    has       multiple,     tabs       that   need to be removed
And
Also contains
multiple,     
line
breaks
that
need to be  removed

    Function RemoveNonPrintableCharacters(ByVal TextData) As String

        Dim dirtyString As String
        Dim cleanString As String
        Dim iPosition As Integer

        If IsNull(TextData) Then
            Exit Function
        End If

        dirtyString = TextData
        cleanString = ""

        For iPosition = 1 To Len(dirtyString)
            Select Case Asc(Mid(dirtyString, iPosition, 1))
                Case 9, 10, 13, 16
                    cleanString = cleanString & " "
                Case Else
                    cleanString = cleanString & Mid(dirtyString, iPosition, 1)
            End Select
        Next

        RemoveNonPrintableCharacters = cleanString

    End Function

A = Chr(9) & "Cat" & Chr(10) & vbCrLf

' Replace needs a third argument: the replacement text ("" removes the character).
A = Replace(A, Chr(10), "")
A = Replace(A, Chr(13), "")
A = Replace(A, Chr(9), "")

MsgBox A

This is how people usually do it.

Your code creates a lot of implicit variables.

' First you need to find the character
YourStr = "Bla bla bla..."

If InStr(YourStr, Chr(10)) > 0 Then
    NewStr = Replace(YourStr, Chr(10), "")
End If
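
To check for several characters at once, the same InStr test can be wrapped in a small helper. This is only a sketch: ContainsAnyCode is a hypothetical name, not a built-in VBA function, and it assumes the codes passed in are valid arguments for Chr (0 to 255).

```vba
' Hypothetical helper: returns True if text contains at least one
' of the character codes passed after it.
Function ContainsAnyCode(ByVal text As String, ParamArray codes() As Variant) As Boolean
    Dim code As Variant
    For Each code In codes
        ' InStr returns the position of the first match, or 0 if there is none.
        If InStr(1, text, Chr(code), vbBinaryCompare) > 0 Then
            ContainsAnyCode = True
            Exit Function
        End If
    Next code
End Function
```

A call such as `If ContainsAnyCode(YourStr, 9, 10, 13, 16) Then ...` would then decide whether any cleaning is needed before the string is touched.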

I am replacing the non-printable characters with a space character, Chr(32), but you can change that as needed.

Function RemoveNonPrintableCharacters(ByVal TextData) As String
Dim sClean$

sClean = Replace(TextData, Chr(9), Chr(32))
sClean = Replace(sClean, Chr(10), Chr(32))
sClean = Replace(sClean, Chr(13), Chr(32))
sClean = Replace(sClean, Chr(16), Chr(32))

RemoveNonPrintableCharacters = sClean

End Function

This only removes non-printing characters from the right-hand end of the string, and it does not replace them with spaces.

Function fRemoveNonPrintableCharacters(ByVal TextData) As String
Dim dirtyString As String
Dim cleanString As String
Dim iPosition As Integer

If IsNull(TextData) Then
    Exit Function
End If

dirtyString = TextData
cleanString = ""

For iPosition = Len(dirtyString) To 1 Step -1
    Select Case Asc(Mid(dirtyString, iPosition, 1))
        Case 9, 10, 13, 16, 32, 160
            ' Trailing whitespace or control character: keep scanning leftward.
        Case Else
            cleanString = Left(dirtyString, iPosition)
            Exit For
    End Select
Next

fRemoveNonPrintableCharacters = cleanString

End Function

It seems this should be simpler, using the Excel Clean function. The following also works:

myString = Worksheets("Sheet1").Range("A" & tRow).Value 
myString = Application.WorksheetFunction.Clean(myString)

You can also use other normal and home-made Excel functions:

myString = Application.WorksheetFunction.Trim(myString)

I still haven't got the Substitute function to work this way, but I am working on it.
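
For what it is worth, one way Substitute might be called from VBA is sketched below. It assumes the code runs inside Excel; CleanAndSubstitute is a hypothetical wrapper name. Clean only strips the low control codes (0 to 31), so Substitute is used here for the non-breaking space, Chr(160), which Clean leaves in place.

```vba
' Hypothetical wrapper; requires Excel because it relies on WorksheetFunction.
Function CleanAndSubstitute(ByVal myString As String) As String
    ' Clean removes control characters such as tab (9) and line feed (10).
    myString = Application.WorksheetFunction.Clean(myString)
    ' Substitute(text, old_text, new_text) swaps Chr(160) for a normal space.
    myString = Application.WorksheetFunction.Substitute(myString, Chr(160), " ")
    CleanAndSubstitute = myString
End Function
```

The built-in VBA Replace function would do the same job as Substitute here and also works outside Excel.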

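This came up as the top Google result when I searched for a quick function to use. I had a good old google around, but nothing fully solved my problem.

The main issue is that all of these functions touch the original string even when there is nothing wrong with it, which slows things down.

I rewrote it so that it only makes a correction when a bad character is present, and I also extended it to cover all non-printable characters and characters beyond standard ASCII.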

Public Function Clean_NonPrintableCharacters(Str As String) As String

    'Removes non-printable characters from a string

    Dim cleanString As String
    Dim i As Integer

    cleanString = Str

    For i = Len(cleanString) To 1 Step -1
        'Debug.Print Asc(Mid(Str, i, 1))

        Select Case Asc(Mid(Str, i, 1))
            Case 1 To 31, Is >= 127
                'Bad stuff
                'https://www.ionos.com/digitalguide/server/know-how/ascii-codes-overview-of-all-characters-on-the-ascii-table/
                cleanString = Left(cleanString, i - 1) & Mid(cleanString, i + 1)

            Case Else
                'Keep

        End Select
    Next i

    Clean_NonPrintableCharacters = cleanString

End Function

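When Unicode characters are present, the code shown here should be modified. My proposal also covers characters that the program does not recognize: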

Public Function Clean_NonPrintableCharacters(Str As String) As String

    'Removes non-printable characters from a string

    Dim cleanString As String
    Dim i As Integer

    cleanString = Str
    
    For i = Len(cleanString) To 1 Step -1

        If Chr(Asc(Mid(cleanString, i, 1))) <> Mid(cleanString, i, 1) Then
            cleanString = Left(cleanString, i - 1) & Mid(cleanString, i + 1)
        End If
        
    Next i

    Clean_NonPrintableCharacters = WorksheetFunction.Clean(cleanString)

End Function  
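
A different approach from the code above, shown only as a sketch, is to look at the Unicode code point with AscW and drop the control ranges while keeping every other character, including non-ANSI ones. StripControlCharsW is a hypothetical name, and the choice of ranges (0 to 31 and 127 to 159) is an assumption about what counts as non-printable. Note that it keeps Chr(160).

```vba
' Hypothetical function: removes C0/C1 control characters by code point.
Public Function StripControlCharsW(ByVal s As String) As String
    Dim i As Long
    Dim codePoint As Long
    Dim result As String

    For i = 1 To Len(s)
        ' AscW returns a signed Integer; the mask maps it to 0..65535.
        codePoint = AscW(Mid$(s, i, 1)) And &HFFFF&
        Select Case codePoint
            Case 0 To 31, 127 To 159
                ' Control character: skip it.
            Case Else
                result = result & Mid$(s, i, 1)
        End Select
    Next i

    StripControlCharsW = result
End Function
```

Unlike the version above, this one does not depend on Excel, so it can also be used from Access.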

This can be solved with RegEx (in the VBE, under Tools - References, add Microsoft VBScript Regular Expressions 5.5):

Function NormalString(ByVal text As String, Optional filler = vbNullString) As String
    Dim re As New RegExp
    With re
        .Pattern = "([\x00-\x1F\xA0])"
        .Global = True
        text = .Replace(text, filler)
    End With
    NormalString = text
End Function
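
If adding the reference is not an option, the same idea can be sketched with late binding through CreateObject. NormalStringLateBound is a hypothetical name; the pattern is the one used above.

```vba
' Late-bound variant: no reference to the regular expression library is needed.
Function NormalStringLateBound(ByVal text As String, _
                               Optional ByVal filler As String = vbNullString) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    With re
        ' Control characters 0-31 plus the non-breaking space (160).
        .Pattern = "[\x00-\x1F\xA0]"
        .Global = True
        NormalStringLateBound = .Replace(text, filler)
    End With
End Function
```

Calling `NormalStringLateBound(someText, " ")` replaces each match with a space, while omitting the second argument removes the characters.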

If a special character is present, replace it with a space:

If InStr(TextData, Chr(9)) > 0 Then TextData = Replace(TextData, Chr(9), Chr(32))
If InStr(TextData, Chr(10)) > 0 Then TextData = Replace(TextData, Chr(10), Chr(32))
If InStr(TextData, Chr(13)) > 0 Then TextData = Replace(TextData, Chr(13), Chr(32))
If InStr(TextData, Chr(16)) > 0 Then TextData = Replace(TextData, Chr(16), Chr(32))