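VBA: How to remove non-printable characters from data
I need to programmatically remove non-printable characters such as:
tab - char(9)
line feed - char(10)
carriage return - char(13)
data link escape - char(16)
I started a generic function that will be called from the lost_focus event of an MS Access form field.
I have not yet figured out how to identify when a string contains the unwanted characters.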
Function RemoveNonPrintableCharacters(ByVal TextData) As String
    Dim dirtyString As String
    Dim cleanString As String
    Dim iPosition As Integer

    If IsNull(TextData) Then
        Exit Function
    End If

    dirtyString = TextData
    cleanString = ""

    For iPosition = 1 To Len(dirtyString)
        Select Case Asc(Mid(dirtyString, iPosition, 1))
            Case 9    ' Char(9)
            Case 10   ' Char(10)
            Case 13   ' Char(13)
            Case 16   ' Char(16)
            Case Else ' Add character to clean field.
                cleanString = cleanString & Mid(dirtyString, iPosition, 1)
        End Select
    Next

    RemoveNonPrintableCharacters = cleanString
End Function
These are the 2 strings I have been using while testing:
This line, has multiple, tabs that need to be removed
This line, has multiple,
line
breaks
that
need to be removed
This line, has multiple, tabs that need to be removed
And
Also contains
multiple,
line
breaks
that
need to be removed
Function RemoveNonPrintableCharacters(ByVal TextData) As String
    Dim dirtyString As String
    Dim cleanString As String
    Dim iPosition As Integer

    If IsNull(TextData) Then
        Exit Function
    End If

    dirtyString = TextData
    cleanString = ""

    For iPosition = 1 To Len(dirtyString)
        Select Case Asc(Mid(dirtyString, iPosition, 1))
            Case 9, 10, 13, 16
                ' Replace the unwanted character with a space.
                cleanString = cleanString & " "
            Case Else
                cleanString = cleanString & Mid(dirtyString, iPosition, 1)
        End Select
    Next

    RemoveNonPrintableCharacters = cleanString
End Function
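A = Chr(9) & "Cat" & Chr(10) & vbCrLf
' Replace needs three arguments; an empty string removes the character.
A = Replace(A, Chr(10), "")
A = Replace(A, Chr(13), "")
A = Replace(A, Chr(9), "")
MsgBox A
This is how people usually do it.
Your code creates a lot of implicit variables.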
' First you need to find the character
YourStr = "Bla bla bla..."
If InStr(YourStr, Chr(10)) > 0 Then
    NewStr = Replace(YourStr, Chr(10), "")
End If
I am replacing the non-printable characters with the space character Chr(32), but you can change that as needed.
Function RemoveNonPrintableCharacters(ByVal TextData) As String
    Dim sClean$

    sClean = Replace(TextData, Chr(9), Chr(32))
    sClean = Replace(sClean, Chr(10), Chr(32))
    sClean = Replace(sClean, Chr(13), Chr(32))
    sClean = Replace(sClean, Chr(16), Chr(32))

    RemoveNonPrintableCharacters = sClean
End Function
This one only removes non-printing characters from the right-hand end of the string, without replacing them with spaces.
Function fRemoveNonPrintableCharacters(ByVal TextData) As String
    Dim dirtyString As String
    Dim cleanString As String
    Dim iPosition As Integer

    If IsNull(TextData) Then
        Exit Function
    End If

    dirtyString = TextData
    cleanString = ""

    ' Walk backwards from the end until the first character worth keeping.
    For iPosition = Len(dirtyString) To 1 Step -1
        Select Case Asc(Mid(dirtyString, iPosition, 1))
            Case 9, 10, 13, 16, 32, 160
                cleanString = cleanString
            Case Else
                cleanString = Left(dirtyString, iPosition)
                Exit For
        End Select
    Next

    fRemoveNonPrintableCharacters = cleanString
End Function
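It seems this should be simpler using the Excel Clean function. The following also works:
myString = Worksheets("Sheet1").Range("A" & tRow).Value
myString = Application.WorksheetFunction.Clean(myString)
You can also use other built-in and home-made Excel functions:
myString = Application.WorksheetFunction.Trim(myString)
I still have not got the Substitute function to work this way, but I am working on it.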
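For what it is worth, here is a minimal sketch of how Substitute might be called from VBA, assuming the code runs inside Excel so that Application.WorksheetFunction is available. The procedure name and the sample string are made up for illustration. Clean only strips the control codes 0 to 31, so the non-breaking space Chr(160) is handled separately.

```vba
Option Explicit

' Hypothetical demo; the procedure name and sample text are assumptions.
Sub DemoSubstituteAndClean()
    Dim myString As String

    ' Sample text with a tab, a non-breaking space and a trailing CR/LF.
    myString = "Cat" & vbTab & "Dog" & Chr(160) & "Bird" & vbCrLf

    With Application.WorksheetFunction
        ' Substitute(text, old_text, new_text): swap the non-breaking space for a normal space.
        myString = .Substitute(myString, Chr(160), " ")
        ' Clean drops control characters 0 to 31 (tab, CR and LF included).
        myString = .Clean(myString)
    End With

    Debug.Print myString
End Sub
```

The native VBA Replace function would do the same job as Substitute and does not depend on Excel, which may matter if the code has to run from Access.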
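This was the top Google result when I searched for a quick function to use. I had a good old google around, but nothing I found fully solved my problem.
The main issue is that all of these functions touch the original string even when there is nothing wrong with it, which slows things down.
I rewrote it so that the string is only amended when a bad character is found, and extended it to cover all non-printable characters and characters beyond standard ASCII.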
Public Function Clean_NonPrintableCharacters(Str As String) As String
    'Removes non-printable characters from a string
    Dim cleanString As String
    Dim i As Integer

    cleanString = Str

    For i = Len(cleanString) To 1 Step -1
        'Debug.Print Asc(Mid(Str, i, 1))
        Select Case Asc(Mid(Str, i, 1))
            Case 1 To 31, Is >= 127
                'Bad stuff
                'https://www.ionos.com/digitalguide/server/know-how/ascii-codes-overview-of-all-characters-on-the-ascii-table/
                cleanString = Left(cleanString, i - 1) & Mid(cleanString, i + 1)
            Case Else
                'Keep
        End Select
    Next i

    Clean_NonPrintableCharacters = cleanString
End Function
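The "only fix it when something is wrong" idea can also be split out into a separate check, which is also one way to tell whether a string contains unwanted characters at all. A minimal sketch, assuming the same definition of a bad character as the function above (codes 1 to 31 and 127 or higher); the name HasNonPrintableCharacters is hypothetical:

```vba
' Hypothetical helper: returns True as soon as one unwanted character is found.
Public Function HasNonPrintableCharacters(ByVal TextData As String) As Boolean
    Dim i As Long   ' Long, so strings longer than 32,767 characters do not overflow

    For i = 1 To Len(TextData)
        Select Case Asc(Mid$(TextData, i, 1))
            Case 1 To 31, Is >= 127
                HasNonPrintableCharacters = True
                Exit Function
        End Select
    Next i
    ' Falls through with the default return value, False.
End Function
```

A caller could then write something like `If HasNonPrintableCharacters(s) Then s = Clean_NonPrintableCharacters(s)`, so clean strings are scanned once and never rebuilt.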
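The code shown here needs to be modified when Unicode characters are present. My proposal also covers characters that the program does not recognize: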
Public Function Clean_NonPrintableCharacters(Str As String) As String
    'Removes non-printable characters from a string
    Dim cleanString As String
    Dim i As Integer

    cleanString = Str

    For i = Len(cleanString) To 1 Step -1
        'Drop any character that does not survive a round trip through Asc/Chr
        If Chr(Asc(Mid(cleanString, i, 1))) <> Mid(cleanString, i, 1) Then
            cleanString = Left(cleanString, i - 1) & Mid(cleanString, i + 1)
        End If
    Next i

    Clean_NonPrintableCharacters = WorksheetFunction.Clean(cleanString)
End Function
This can be solved with RegEx (add the Microsoft VBScript Regular Expressions reference under Tools - References in the VBE):
Function NormalString(text As String, Optional filler = vbNullString) As String
    Dim re As New RegExp

    With re
        .Pattern = "([\x00-\x1F\xA0])"
        .Global = True
        text = .Replace(text, filler)
    End With

    NormalString = text
End Function
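If adding a reference is not an option, the same pattern should also work with late binding. This is only a sketch: the function name NormalStringLateBound is made up, and the parameters are passed ByVal so the caller's variable is left untouched, which differs from the version above.

```vba
' Sketch: same pattern as above, but late-bound so no reference is required.
Function NormalStringLateBound(ByVal text As String, _
                               Optional ByVal filler As String = vbNullString) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")

    With re
        .Pattern = "[\x00-\x1F\xA0]"   ' control characters 0-31 plus the non-breaking space
        .Global = True                  ' replace every match, not just the first
        NormalStringLateBound = .Replace(text, filler)
    End With
End Function
```

For example, `NormalStringLateBound("Cat" & vbTab & "Dog", " ")` would swap the tab for a space, while omitting the second argument removes the matched characters outright.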
If special characters are present, replace them with spaces:
If InStr(TextData, Chr(9)) > 0 Then TextData = Replace(TextData, Chr(9), Chr(32))
If InStr(TextData, Chr(10)) > 0 Then TextData = Replace(TextData, Chr(10), Chr(32))
If InStr(TextData, Chr(13)) > 0 Then TextData = Replace(TextData, Chr(13), Chr(32))
If InStr(TextData, Chr(16)) > 0 Then TextData = Replace(TextData, Chr(16), Chr(32))