模糊字符串匹配 Excel
Fuzzy string matching Excel
我目前需要一个模糊字符串匹配算法。我从这里给出的 link 中找到了一个 VBA 代码:Fuzzy Matching.
Function FuzzyFind(lookup_value As String, tbl_array As Range) As String
Dim i As Integer, str As String, Value As String
Dim a As Integer, b As Integer, cell As Variant
For Each cell In tbl_array
str = cell
For i = 1 To Len(lookup_value)
If InStr(cell, Mid(lookup_value, i, 1)) > 0 Then
a = a + 1
cell = Mid(cell, 1, InStr(cell, Mid(lookup_value, i, 1)) - 1) & Mid(cell, InStr(cell, Mid(lookup_value, i, 1)) + 1, 9999)
End If
Next i
a = a - Len(cell)
If a > b Then
b = a
Value = str
End If
a = 0
Next cell
FuzzyFind = Value
End Function
但是,无论“正确答案”有多远,这都会为您提供匹配。有什么方法可以实现函数给出“N/A”,比方说,距离原始字符串 4 个字符或更多?
试试这个,看看它是否是您想要的。它大致基于您那里的那个。
编辑:做了更多测试,发现我的原始版本不太正确。这应该会更好,但几乎不可能让这样的东西适用于所有可能发生的情况。
Function FuzzyFind(lookup_value As String, tbl_array As Range) As String
Dim i As Long, cell As Range, Matches As Long, LengthError As Long, _
FuzzyValue As String, FuzzyMatch As Long, L As String, C As String, MultipleReturns As Boolean
For Each cell In tbl_array
Matches = 0
If cell.Value <> "" Then
L = UCase(lookup_value)
C = UCase(cell.Value)
For i = 1 To Len(L)
If InStr(Mid(L, i, Len(L) - i), Mid(C, i, 1)) > 0 Then
Matches = Matches + 1
Else
Matches = Matches - 1
End If
Next i
LengthError = Abs(Len(C) - Len(L))
Matches = Matches - LengthError
If Len(L) - Matches <= 4 And Matches >= FuzzyMatch Then
If Matches = FuzzyMatch Then
MultipleReturns = True
Exit For
End If
FuzzyValue = cell.Value
FuzzyMatch = Matches
End If
End If
Next
If FuzzyValue <> "" Then
If MultipleReturns = True Then
FuzzyFind = "N/A (Multiple Returns)"
Else
FuzzyFind = FuzzyValue
End If
Else
FuzzyFind = "N/A"
End If
End Function
我目前需要一个模糊字符串匹配算法。我从这里给出的 link 中找到了一个 VBA 代码:Fuzzy Matching.
Function FuzzyFind(lookup_value As String, tbl_array As Range) As String
Dim i As Integer, str As String, Value As String
Dim a As Integer, b As Integer, cell As Variant
For Each cell In tbl_array
str = cell
For i = 1 To Len(lookup_value)
If InStr(cell, Mid(lookup_value, i, 1)) > 0 Then
a = a + 1
cell = Mid(cell, 1, InStr(cell, Mid(lookup_value, i, 1)) - 1) & Mid(cell, InStr(cell, Mid(lookup_value, i, 1)) + 1, 9999)
End If
Next i
a = a - Len(cell)
If a > b Then
b = a
Value = str
End If
a = 0
Next cell
FuzzyFind = Value
End Function
但是,无论“正确答案”有多远,这都会为您提供匹配。有什么方法可以实现函数给出“N/A”,比方说,距离原始字符串 4 个字符或更多?
试试这个,看看它是否是您想要的。它大致基于您那里的那个。
编辑:做了更多测试,发现我的原始版本不太正确。这应该会更好,但几乎不可能让这样的东西适用于所有可能发生的情况。
Function FuzzyFind(lookup_value As String, tbl_array As Range) As String
Dim i As Long, cell As Range, Matches As Long, LengthError As Long, _
FuzzyValue As String, FuzzyMatch As Long, L As String, C As String, MultipleReturns As Boolean
For Each cell In tbl_array
Matches = 0
If cell.Value <> "" Then
L = UCase(lookup_value)
C = UCase(cell.Value)
For i = 1 To Len(L)
If InStr(Mid(L, i, Len(L) - i), Mid(C, i, 1)) > 0 Then
Matches = Matches + 1
Else
Matches = Matches - 1
End If
Next i
LengthError = Abs(Len(C) - Len(L))
Matches = Matches - LengthError
If Len(L) - Matches <= 4 And Matches >= FuzzyMatch Then
If Matches = FuzzyMatch Then
MultipleReturns = True
Exit For
End If
FuzzyValue = cell.Value
FuzzyMatch = Matches
End If
End If
Next
If FuzzyValue <> "" Then
If MultipleReturns = True Then
FuzzyFind = "N/A (Multiple Returns)"
Else
FuzzyFind = FuzzyValue
End If
Else
FuzzyFind = "N/A"
End If
End Function