如果字符串是较长单词的一部分,则查找/替换以排除
Find / Replace to exclude if string is part of longer word
我正在尝试搜索三个字母的目标词并将其替换为更正后的三个字母的词。
例如
CHI - 作为单个单元格条目(带连字符)替换为“ORD -”。
在某些情况下,目标词是单元格中词对的一部分,例如CHI - SHA.
下面的代码捕获了这些情况。
我意识到当细胞是例如XIANCHI - SHA 它还会更正部分“CHI -”,导致 XIANORD - SHA。
如果目标字母是较长单词的一部分,我如何限制 fndlist 跳过目标字母?
样本
- CHI -(单个单元格输入)转换为 ORD -
- CHI - PVG(一个电池)转换为 ORD - PVG
- XIANCHI - PVG 转换为 XIANORD - PVG (错误)
如果我使用 lookat:xlwhole 代码只会捕捉到 CHI - 大小写而不是配对,但是如果我使用 xlpart 它会捕捉到 CHI - PVG 对但还会更正它在该元素中找到的任何单词。
Sub Adjust_Airport_Codes2()
Dim sht As Worksheet
Dim fndList As Variant
Dim rplcList As Variant
Dim x As Long
fndList = Array("BUE -", "CHI -", "DCA -", "HOU -", "LGA -", "NYC -", "WAS -", "AEJ -", "BUS -", "CGH -", "CPS -", "DGM -", "EHA -", "EHB -", "EHF -", "FOQ -", "FQC -", "JBN -", "LCY -", "LGW -", "LIN -", "LON -", "MIL -", "MOW -", "NAY -", "ORY -", "OSA -", "PAR -", "PUS -", "QPG -", "RIO -", "SAO -", "SAW -", "SDU -", "SDV -", "SEL -", "PVG -", "TSF -", "TYO -", "UAQ -", "VIT -", "YMX -", "YTO -", "ZIS -", "CNF -", "HND -", "IZM -", "JKT -", "LTN -", "MMA -", "UXM -", "VCE -", "VSS -")
rplcList = Array("EZE -", "ORD -", "IAD -", "IAH -", "JFK -", "JFK -", "IAD -", "AMS -", "ICN -", "GRU -", "VCP -", "HKG -", "AMS -", "BRU -", "HHN -", "HKG -", "FRA -", "PRG -", "LHR -", "LHR -", "MXP -", "LHR -", "MXP -", "SVO -", "PEK -", "CDG -", "KIX -", "CDG -", "ICN -", "SIN -", "GIG -", "GRU -", "IST -", "GIG -", "TLV -", "ICN -", "SHA -", "MXP -", "NRT -", "EZE -", "BIO -", "YUL -", "YYZ -", "HKG -", "BHZ -", "NRT -", "ADB -", "CGK -", "LHR -", "MMX -", "FRA -", "MXP -", "MHG -")
'Loop through each item in Array lists
For x = LBound(fndList) To UBound(fndList)
'Loop through each worksheet in ActiveWorkbook
For Each sht In ActiveWorkbook.Worksheets
sht.Cells.Replace What:=fndList(x), Replacement:=rplcList(x), _
LookAt:=xlpart, SearchOrder:=xlByRows, MatchCase:=True, _
SearchFormat:=False, ReplaceFormat:=False
Next sht
Next x
End Sub
编辑:
我想给你一些更完整的东西。在下面的代码中,我使用了一个单独的函数来创建前后值之间的映射。这清理了代码,因为现在所有这些值都存储在一个地方(也更容易维护)。然后我使用此对象创建搜索模式,因为正则表达式可以一次搜索多个模式。最后,我用字典来 return 替换值。试试这个修改后的代码,看看它是否更符合您的用例。
我 运行 快速性能测试,看看它是否比 built-in VBA 执行 better/worse 替换功能。在我的测试中,我只使用了正则表达式 search/replace 中的三种可能性,并且我 运行 对 103k 行进行了测试。它的性能与仅使用一个值的 built-in 搜索和替换一样好。对于每个搜索值,搜索和替换必须是 re-run。
如果有帮助请告诉我。
Function GetMap() As Object
Dim oMap As Object
Set oMap = CreateObject("Scripting.Dictionary")
oMap.Add "BUE -", "EZE -"
oMap.Add "CHI -", "ORD -"
oMap.Add "DCA -", "IAD -"
''' Add the rest of the mapped items
'''
'''
Set GetMap = oMap
End Function
Sub TestIt()
Dim oReg As Object
Dim oMap As Object
Dim m As Object
Dim rng As Range
Dim cel As Range
Dim t As Double
Set oReg = CreateObject("VbScript.Regexp")
Set oMap = GetMap()
With oReg
.Global = False
'Multiple patterns can be searched at once by
'separating them with pipes. Since we have the
'patterns to search for in the oMap dictionary,
'we can simply join it here. The benefit is that if
'you have to support new items, you only have to add
'them in the GetMap() function
.Pattern = "^" & Join(oMap.Keys, "|^")
End With
'Set your range appropriately
Set rng = Range("A1:A103680")
t = Timer
Application.ScreenUpdating = False
For Each cel In rng
If oReg.Test(cel.Value) Then
Set m = oReg.Execute(cel.Value)
cel.Value = oReg.Replace(cel.Value, oMap(m(0).Value))
End If
Next cel
Debug.Print "Process took " & Timer - t & " seconds."
Application.ScreenUpdating = True
End Sub
您可以考虑使用正则表达式模式匹配。使用模式匹配时,可以使用^
符号来表示字符串的开始。请参阅下面的代码作为一个简单示例,并尝试将其插入您的代码中。如果您 运行 遇到问题,请告诉我们。
Sub Tester()
Dim oReg As Object
Set oReg = CreateObject("VbScript.Regexp")
With oReg
.Global = False
.Pattern = "^CHI -"
End With
'Will return: ORD PVG
If oReg.test("CHI - PVG") Then
Debug.Print oReg.Replace("CHI - PVG", "ORD")
End If
'Won't trigger below
If oReg.test("XIANCHI - PVG") Then
Debug.Print oReg.Replace("XIANCHI - PVG", "ORD")
End If
End Sub
使用列表替换
- 注意:此代码用值替换每个工作表中的整个范围。如果有公式,则为'lost'.
- 我不明白为什么你需要“-”所以我删除了它们。如果需要,请添加它们。
流程(并非所有步骤和一些不准确之处)
- 将数组中的值写入字典。
- 遍历每个工作表。
- 将其使用范围(数据范围)中的值写入数据数组。
- 遍历数据数组中的每个元素。
- 如果没有错误或空值,则检查元素。
- 按 space 字符拆分为当前值数组。
- 根据字典的键检查当前值数组中的每个元素,如果找到则将其替换为字典的值。
- 重新连接当前值数组中的元素。并将可能修改的值写回数据数组中的当前元素。
- 将数据数组中可能修改的值写回数据范围。
代码
Option Explicit
Sub Adjust_Airport_Codes2()
' Define Find and Replace Arrays.
' Define Find Array.
Dim fndList As Variant
fndList = Array("BUE", "CHI", "DCA", "HOU", "LGA", "NYC", "WAS", "AEJ", _
"BUS", "CGH", "CPS", "DGM", "EHA", "EHB", "EHF", "FOQ", _
"FQC", "JBN", "LCY", "LGW", "LIN", "LON", "MIL", "MOW", _
"NAY", "ORY", "OSA", "PAR", "PUS", "QPG", "RIO", "SAO", _
"SAW", "SDU", "SDV", "SEL", "PVG", "TSF", "TYO", "UAQ", _
"VIT", "YMX", "YTO", "ZIS", "CNF", "HND", "IZM", "JKT", _
"LTN", "MMA", "UXM", "VCE", "VSS")
' Define Replace Array.
Dim rplcList As Variant
rplcList = Array("EZE", "ORD", "IAD", "IAH", "JFK", "JFK", "IAD", "AMS", _
"ICN", "GRU", "VCP", "HKG", "AMS", "BRU", "HHN", "HKG", _
"FRA", "PRG", "LHR", "LHR", "MXP", "LHR", "MXP", "SVO", _
"PEK", "CDG", "KIX", "CDG", "ICN", "SIN", "GIG", "GRU", _
"IST", "GIG", "TLV", "ICN", "SHA", "MXP", "NRT", "EZE", _
"BIO", "YUL", "YYZ", "HKG", "BHZ", "NRT", "ADB", "CGK", _
"LHR", "MMX", "FRA", "MXP", "MHG")
' Write values from Find and Replace Arrays to the Dictionary.
Dim dict As Object ' The Dictionary
Set dict = CreateObject("Scripting.Dictionary")
Dim n As Long ' Find and Replace Arrays Element Counter
For n = LBound(fndList) To UBound(fndList)
dict(fndList(n)) = rplcList(n)
Next n
' Find and replace values in each worksheet of the ActiveWorkbook.
' Declare variables to be used in loop.
Dim sht As Worksheet ' Current Worksheet
Dim rng As Range ' Current Data Range
Dim Data As Variant ' Current Data Array
Dim CurVal As Variant ' Current Value:
' The value of the current element of Data Array
Dim CurValues As Variant ' Current Values Array:
' The 'words' contained in current element
' of Data Array
Dim i As Long ' Data Array Rows Counter
Dim j As Long ' Data Array Columns Counter
Dim DataChanged As Boolean ' Data Changed Switch
' Iterate worksheets in ActiveWorkbook.
For Each sht In ActiveWorkbook.Worksheets
' Define Data Range (there are other ways).
Set rng = sht.UsedRange
' Write values from Data Range to Data Array.
If rng.Rows.Count > 1 Or rng.Columns.Count > 1 Then
Data = rng.Value
Else
ReDim Data(1 To 1, 1 To 1)
Data(1, 1) = rng.Value
End If
' Iterate rows in Data Array.
For i = 1 To UBound(Data, 1)
' Iterate columns in Data Array.
For j = 1 To UBound(Data, 2)
' Write value of current element to Current Value.
CurVal = Data(i, j)
' Check if Current Value is not an error or empty value.
If Not IsError(CurVal) And Not IsEmpty(CurVal) Then
' Split Current Value by the space character into
' Current Values Array.
CurValues = Split(CurVal)
' Iterate elements of Current Values Array.
For n = LBound(CurValues) To UBound(CurValues)
' Check if they exist as a Key in the Dictionary.
If dict.Exists(CurValues(n)) Then
' Write value of Dictionary to current element
' in Current Values Array.
CurValues(n) = dict(CurValues(n))
DataChanged = True
' You can increase performance if you're expecting
' only one possibly found value per cell:
'Exit For
End If
Next n
' Write elements of Current Values Array, joined with
' the space character, to current element in Data Array.
If DataChanged Then
Data(i, j) = Join(CurValues)
DataChanged = False
End If
End If
Next j
Next i
' Write values from Data Array to Data Range.
rng.Value = Data
Next sht
End Sub
我正在尝试搜索三个字母的目标词并将其替换为更正后的三个字母的词。
例如
CHI - 作为单个单元格条目(带连字符)替换为“ORD -”。
在某些情况下,目标词是单元格中词对的一部分,例如CHI - SHA.
下面的代码捕获了这些情况。
我意识到当细胞是例如XIANCHI - SHA 它还会更正部分“CHI -”,导致 XIANORD - SHA。
如果目标字母是较长单词的一部分,我如何限制 fndlist 跳过目标字母?
样本
- CHI -(单个单元格输入)转换为 ORD -
- CHI - PVG(一个电池)转换为 ORD - PVG
- XIANCHI - PVG 转换为 XIANORD - PVG (错误)
如果我使用 lookat:xlwhole 代码只会捕捉到 CHI - 大小写而不是配对,但是如果我使用 xlpart 它会捕捉到 CHI - PVG 对但还会更正它在该元素中找到的任何单词。
Sub Adjust_Airport_Codes2()
Dim sht As Worksheet
Dim fndList As Variant
Dim rplcList As Variant
Dim x As Long
fndList = Array("BUE -", "CHI -", "DCA -", "HOU -", "LGA -", "NYC -", "WAS -", "AEJ -", "BUS -", "CGH -", "CPS -", "DGM -", "EHA -", "EHB -", "EHF -", "FOQ -", "FQC -", "JBN -", "LCY -", "LGW -", "LIN -", "LON -", "MIL -", "MOW -", "NAY -", "ORY -", "OSA -", "PAR -", "PUS -", "QPG -", "RIO -", "SAO -", "SAW -", "SDU -", "SDV -", "SEL -", "PVG -", "TSF -", "TYO -", "UAQ -", "VIT -", "YMX -", "YTO -", "ZIS -", "CNF -", "HND -", "IZM -", "JKT -", "LTN -", "MMA -", "UXM -", "VCE -", "VSS -")
rplcList = Array("EZE -", "ORD -", "IAD -", "IAH -", "JFK -", "JFK -", "IAD -", "AMS -", "ICN -", "GRU -", "VCP -", "HKG -", "AMS -", "BRU -", "HHN -", "HKG -", "FRA -", "PRG -", "LHR -", "LHR -", "MXP -", "LHR -", "MXP -", "SVO -", "PEK -", "CDG -", "KIX -", "CDG -", "ICN -", "SIN -", "GIG -", "GRU -", "IST -", "GIG -", "TLV -", "ICN -", "SHA -", "MXP -", "NRT -", "EZE -", "BIO -", "YUL -", "YYZ -", "HKG -", "BHZ -", "NRT -", "ADB -", "CGK -", "LHR -", "MMX -", "FRA -", "MXP -", "MHG -")
'Loop through each item in Array lists
For x = LBound(fndList) To UBound(fndList)
'Loop through each worksheet in ActiveWorkbook
For Each sht In ActiveWorkbook.Worksheets
sht.Cells.Replace What:=fndList(x), Replacement:=rplcList(x), _
LookAt:=xlpart, SearchOrder:=xlByRows, MatchCase:=True, _
SearchFormat:=False, ReplaceFormat:=False
Next sht
Next x
End Sub
编辑: 我想给你一些更完整的东西。在下面的代码中,我使用了一个单独的函数来创建前后值之间的映射。这清理了代码,因为现在所有这些值都存储在一个地方(也更容易维护)。然后我使用此对象创建搜索模式,因为正则表达式可以一次搜索多个模式。最后,我用字典来 return 替换值。试试这个修改后的代码,看看它是否更符合您的用例。
我 运行 快速性能测试,看看它是否比 built-in VBA 执行 better/worse 替换功能。在我的测试中,我只使用了正则表达式 search/replace 中的三种可能性,并且我 运行 对 103k 行进行了测试。它的性能与仅使用一个值的 built-in 搜索和替换一样好。对于每个搜索值,搜索和替换必须是 re-run。
如果有帮助请告诉我。
Function GetMap() As Object
Dim oMap As Object
Set oMap = CreateObject("Scripting.Dictionary")
oMap.Add "BUE -", "EZE -"
oMap.Add "CHI -", "ORD -"
oMap.Add "DCA -", "IAD -"
''' Add the rest of the mapped items
'''
'''
Set GetMap = oMap
End Function
Sub TestIt()
Dim oReg As Object
Dim oMap As Object
Dim m As Object
Dim rng As Range
Dim cel As Range
Dim t As Double
Set oReg = CreateObject("VbScript.Regexp")
Set oMap = GetMap()
With oReg
.Global = False
'Multiple patterns can be searched at once by
'separating them with pipes. Since we have the
'patterns to search for in the oMap dictionary,
'we can simply join it here. The benefit is that if
'you have to support new items, you only have to add
'them in the GetMap() function
.Pattern = "^" & Join(oMap.Keys, "|^")
End With
'Set your range appropriately
Set rng = Range("A1:A103680")
t = Timer
Application.ScreenUpdating = False
For Each cel In rng
If oReg.Test(cel.Value) Then
Set m = oReg.Execute(cel.Value)
cel.Value = oReg.Replace(cel.Value, oMap(m(0).Value))
End If
Next cel
Debug.Print "Process took " & Timer - t & " seconds."
Application.ScreenUpdating = True
End Sub
您可以考虑使用正则表达式模式匹配。使用模式匹配时,可以使用^
符号来表示字符串的开始。请参阅下面的代码作为一个简单示例,并尝试将其插入您的代码中。如果您 运行 遇到问题,请告诉我们。
Sub Tester()
Dim oReg As Object
Set oReg = CreateObject("VbScript.Regexp")
With oReg
.Global = False
.Pattern = "^CHI -"
End With
'Will return: ORD PVG
If oReg.test("CHI - PVG") Then
Debug.Print oReg.Replace("CHI - PVG", "ORD")
End If
'Won't trigger below
If oReg.test("XIANCHI - PVG") Then
Debug.Print oReg.Replace("XIANCHI - PVG", "ORD")
End If
End Sub
使用列表替换
- 注意:此代码用值替换每个工作表中的整个范围。如果有公式,则为'lost'.
- 我不明白为什么你需要“-”所以我删除了它们。如果需要,请添加它们。
流程(并非所有步骤和一些不准确之处)
- 将数组中的值写入字典。
- 遍历每个工作表。
- 将其使用范围(数据范围)中的值写入数据数组。
- 遍历数据数组中的每个元素。
- 如果没有错误或空值,则检查元素。
- 按 space 字符拆分为当前值数组。
- 根据字典的键检查当前值数组中的每个元素,如果找到则将其替换为字典的值。
- 重新连接当前值数组中的元素。并将可能修改的值写回数据数组中的当前元素。
- 将数据数组中可能修改的值写回数据范围。
代码
Option Explicit
Sub Adjust_Airport_Codes2()
' Define Find and Replace Arrays.
' Define Find Array.
Dim fndList As Variant
fndList = Array("BUE", "CHI", "DCA", "HOU", "LGA", "NYC", "WAS", "AEJ", _
"BUS", "CGH", "CPS", "DGM", "EHA", "EHB", "EHF", "FOQ", _
"FQC", "JBN", "LCY", "LGW", "LIN", "LON", "MIL", "MOW", _
"NAY", "ORY", "OSA", "PAR", "PUS", "QPG", "RIO", "SAO", _
"SAW", "SDU", "SDV", "SEL", "PVG", "TSF", "TYO", "UAQ", _
"VIT", "YMX", "YTO", "ZIS", "CNF", "HND", "IZM", "JKT", _
"LTN", "MMA", "UXM", "VCE", "VSS")
' Define Replace Array.
Dim rplcList As Variant
rplcList = Array("EZE", "ORD", "IAD", "IAH", "JFK", "JFK", "IAD", "AMS", _
"ICN", "GRU", "VCP", "HKG", "AMS", "BRU", "HHN", "HKG", _
"FRA", "PRG", "LHR", "LHR", "MXP", "LHR", "MXP", "SVO", _
"PEK", "CDG", "KIX", "CDG", "ICN", "SIN", "GIG", "GRU", _
"IST", "GIG", "TLV", "ICN", "SHA", "MXP", "NRT", "EZE", _
"BIO", "YUL", "YYZ", "HKG", "BHZ", "NRT", "ADB", "CGK", _
"LHR", "MMX", "FRA", "MXP", "MHG")
' Write values from Find and Replace Arrays to the Dictionary.
Dim dict As Object ' The Dictionary
Set dict = CreateObject("Scripting.Dictionary")
Dim n As Long ' Find and Replace Arrays Element Counter
For n = LBound(fndList) To UBound(fndList)
dict(fndList(n)) = rplcList(n)
Next n
' Find and replace values in each worksheet of the ActiveWorkbook.
' Declare variables to be used in loop.
Dim sht As Worksheet ' Current Worksheet
Dim rng As Range ' Current Data Range
Dim Data As Variant ' Current Data Array
Dim CurVal As Variant ' Current Value:
' The value of the current element of Data Array
Dim CurValues As Variant ' Current Values Array:
' The 'words' contained in current element
' of Data Array
Dim i As Long ' Data Array Rows Counter
Dim j As Long ' Data Array Columns Counter
Dim DataChanged As Boolean ' Data Changed Switch
' Iterate worksheets in ActiveWorkbook.
For Each sht In ActiveWorkbook.Worksheets
' Define Data Range (there are other ways).
Set rng = sht.UsedRange
' Write values from Data Range to Data Array.
If rng.Rows.Count > 1 Or rng.Columns.Count > 1 Then
Data = rng.Value
Else
ReDim Data(1 To 1, 1 To 1)
Data(1, 1) = rng.Value
End If
' Iterate rows in Data Array.
For i = 1 To UBound(Data, 1)
' Iterate columns in Data Array.
For j = 1 To UBound(Data, 2)
' Write value of current element to Current Value.
CurVal = Data(i, j)
' Check if Current Value is not an error or empty value.
If Not IsError(CurVal) And Not IsEmpty(CurVal) Then
' Split Current Value by the space character into
' Current Values Array.
CurValues = Split(CurVal)
' Iterate elements of Current Values Array.
For n = LBound(CurValues) To UBound(CurValues)
' Check if they exist as a Key in the Dictionary.
If dict.Exists(CurValues(n)) Then
' Write value of Dictionary to current element
' in Current Values Array.
CurValues(n) = dict(CurValues(n))
DataChanged = True
' You can increase performance if you're expecting
' only one possibly found value per cell:
'Exit For
End If
Next n
' Write elements of Current Values Array, joined with
' the space character, to current element in Data Array.
If DataChanged Then
Data(i, j) = Join(CurValues)
DataChanged = False
End If
End If
Next j
Next i
' Write values from Data Array to Data Range.
rng.Value = Data
Next sht
End Sub