使用 VBA 识别 Excel 字符串中的产品代码
Using VBA to identify a product code within an Excel string
快速后台:
我正在 Visual Basic 中创建一个搜索工具,它允许我在我的数据库中搜索名称不一致的 material,这些都是作为自由文本输入的。虽然我已经(在 Stack Overflow 用户的帮助下)开发了一个可以一次搜索数百个或多个项目的工具,但我需要进一步改进它。
我的问题:
我需要能够从这些 material 描述中提取项目代码。这些项目是一般数字,例如:20405-002 或者:A445 甚至 B463-563 .这些是我要搜索的主要代码类型,它们将是唯一标识符。
一些例子:
在意大利的一家工厂里,我有一个 material 命名为:
Siemens;Motor;A4002
在德国的一家工厂,它被称为:
Motor;FP4742;Siemens;TurnFast;A4002
我会搜索术语 Siemens, Motor
我当前的搜索是 return Siemens, Motor 在第一个旁边, Motor, Siemens 在第一个旁边第二。然后我希望 visual basic 在本质上说 'these could be the same part',然后在两者中查找匹配的代码。当它找到匹配代码时,我希望它在 excel 单元格中 return 某种指示器。
总体目标: 拥有一个工具,我可以用它来查找两个 material 是否实际上相同,并且需要最少的人工输入。两株植物中的每株最多可能有 50,000 materials。我也有这些零件的价格和供应商。虽然供应商有 75% 的时间是相同的,但价格通常与不同国家/地区的相同 material 相差 20% 以内。如果您对如何查看两个自由文本 material 是否实际上相同有任何其他想法,我很乐意听取。
我的搜索码:
Function MultiSplitX(ByVal SourceText As String, RemoveBlankItems As Boolean, ParamArray Delimiters()) As String()
Dim a As Integer, b As Integer, n As Integer
Dim i As Integer: i = 33
Dim u As Variant, v As Variant
Dim tempArr() As String, finalArr() As String, fDelimiters() As String
If InStr(TypeName(Delimiters(0)), "()") <> 0 And LBound(Delimiters) = UBound(Delimiters) Then
ReDim fDelimiters(LBound(Delimiters(0)) To UBound(Delimiters(0))) 'If passing array vs array items then
For a = LBound(Delimiters(0)) To UBound(Delimiters(0)) 'build that array
fDelimiters(a) = Delimiters(0)(a)
Next a
Else
fDelimiters = Delimiters(0)
End If
Do While InStr(SourceText, Chr(i)) <> 0 'Find an unused character
i = i + 1
Loop
For a = LBound(fDelimiters) To UBound(fDelimiters) 'Sort Delimiters by length
For b = a + 1 To UBound(fDelimiters)
If Len(fDelimiters(a)) < Len(fDelimiters(b)) Then
u = fDelimiters(b)
fDelimiters(b) = fDelimiters(a)
fDelimiters(a) = u
End If
Next b
Next a
For Each v In fDelimiters 'Replace Delimiters with a common character
SourceText = Replace(SourceText, v, Chr(i))
Next
tempArr() = Split(SourceText, Chr(i)) 'Remove empty array items
If RemoveBlankItems = True Then
ReDim finalArr(LBound(tempArr) To UBound(tempArr))
n = LBound(tempArr)
For i = LBound(tempArr) To UBound(tempArr)
If tempArr(i) <> "" Then
finalArr(n) = tempArr(i)
n = n + 1
End If
Next i
n = n - 1
ReDim Preserve finalArr(LBound(tempArr) To n)
MultiSplitX = finalArr
Else: MultiSplitX = tempArr
End If
Erase finalArr
Erase tempArr
End Function
感谢大家的帮助:)
这是用 VBA 为 Excel 编写的响应,但使用数组 get/put 数据,因此您应该能够轻松地为数据库修改它。 VB非常相似。如果我要完成这项工作,我会在 MS Access 中完成,在这种情况下,您可以更轻松地调整此代码。当然,直接 VB 始终是一种选择。 VB 不是一个很好的工具。
如果您经常处理数据,我强烈建议您学习免费和开源的 Python 语言。您可以在 Youtube 上找到来自 Sentdex 的精彩 Python 视频系列。他的视频很好,很慢。您将很快超越 VB.
所能达到的水平
由于缺乏细节和小样本数据,很难全面回答这个问题。
有很多方法可以解决这个问题,具体取决于您想要的输出。我正在做以下假设。
- 您是编码新手,希望输出易于阅读。作为
因此,我的解决方案默认为单个 2x2 结果数组。你
可以通过设置 DeepArr = True 将其更改为 3+ 维。
- 您希望将结果粘贴到同一作品中sheet。
- 您有一个单独的 supplier/vendor 代码列表,可以在零件代码中找到。 GuessSupplier 函数取决于此假设。如果需要,根据实际需求更新功能。
- 我将您的原始输入(如 Siemens;Motor;A4002)称为零件代码。
- 我假设最后 semi-colon 之后的文本将始终是该部分
数字。如果没有,您可以轻松地替换该假设
GuessPartNum 函数。
下面介绍我用来简单测试的点差sheet。
Sheet "PartCodes" 在单列中包含零件代码,在单元格 B3:B6 中包含示例值(B2 中的 header)。列 G-H 保留用于结果。
Sheet "Suppliers" 在单个列中包含唯一的供应商列表 (B3:B6)。
您可以在 RunMain() 子程序中为输入和输出指定 sheet 名称和范围。
为方便起见,我在某些地方对 sheet 名称进行了硬编码。你应该把这些作为论据浮出水面。
为了便于理解,代码有些冗长。
我没有测试性能,因为我没有数据集,希望你 运行 很少这样做。
我只添加了少量的错误处理。
我的完整代码集如下。您会在底部附近找到 RunMain() 子程序。这将启动控制工作流的 Main() sub。
Option Base 0
Option Explicit
' 1) Manually eliminate duplicates in your parts list using Excel built-in feature.
' a) highlight the range
' b) Data ribbon > Remove Duplicates
' 2) Create a supplier list on a separate sheet in teh same workbook
' 3) Edit the RunMain() procedure per your data. I assume: your part code list
' - part code list is in cells B3:B10 of the PartCodes sheet.
' - supplier list in cells b4:b6 of Suppliers sheet.
' - output goes to D2 in PartCodes sheet.
' 4) Run the RunMain() procedure simply kicks off Main.
' Main() sub does the following:
' a)Run ProcessPartCodes:
' i. load the parts codes from the worksheet into an array
' ii. run GuessPartNum and GuessSupplier and place results in the parts code array.
' b) Run FindMatches to add more to the array. Finds other part codes that may be for the same part.
' Logic is described in the function.
' c) Run ArrayToRange to paste part of the result set to the workseet. Note that
' the ourput array is more than 2 dimensions, so not all data is pasted neatly.
' I leave it to you to determine how you want to format the data for output.
'
Function RangeToArray(inputRange As Range)
'Copies values from a rectangular range to a 2D Array.
'Array is always 2D, even if data is a single column or row.
'inputRange: a rectangular range
Dim Col1 As Integer, row1 As Integer
Dim i As Integer, j As Integer
Dim rowCnt As Integer
Dim colCnt As Integer
Dim retArr() As Variant
' Size output array
rowCnt = inputRange.Rows.Count
colCnt = inputRange.Columns.Count
ReDim retArr(1 To rowCnt, 1 To colCnt) As Variant
' Load range values into array
For i = 1 To rowCnt
For j = 1 To (colCnt)
retArr(i, j) = Trim(inputRange.Cells(i, j))
Next j
Next i
' Return array
RangeToArray = retArr
End Function
Sub ArrayToRange(myArr As Variant, Target As Range)
' Copies the content of a 2D array to a Range.
' myArr must be exactly 2 dimensions
' Target is a range. If more than 1 cell, the top left cell is used.
' Copies the array to the range starting with the top left cell.
' Target Range size can be a single cell and need not match the array dimensions.
Dim r As Long, tgtRow As Long
Dim c As Long, tgtCol As Long
Dim firstRow As Long
Dim firstCol As Long
Dim lastRow As Long
Dim lastCol As Long
' Find the top left cell of the Target Range
tgtRow = Target.Row
tgtCol = Target.Column
' Set target range dimesions based on array size.
firstRow = tgtRow + LBound(myArr, 1)
firstCol = tgtCol + LBound(myArr, 2)
lastRow = tgtRow + UBound(myArr, 1)
lastCol = tgtCol + UBound(myArr, 2)
' The next row would usually work. If you get funky data, it will fail,
' so, we will use a loop instead.
' Range(Cells(firstRow, firstCol), Cells(lastRow, lastCol)) = myArr
' Loop through rows and columns, setting cell values one at a time.
For r = LBound(myArr, 1) To UBound(myArr, 1)
For c = LBound(myArr, 2) To UBound(myArr, 2)
On Error Resume Next ' Prevent one bad value from killing the entire operation.
Cells(tgtRow + r - 1, tgtCol + c) = myArr(r, c)
On Error GoTo 0
Next c
Next r
End Sub
' Not used, this is just an example
'Public Function RangeCorners(Optional MyRange As Range = Range("c2:c10"))
' TopLeft = MyRange.Cells(1)
' BottomLeft = MyRange.Cells(.Rows.Count, 1)
' TopRight = MyRange.Cells(1, .Columns.Count)
' BottomRigt = MyRange.Cells(.Cells.Count)
' RangeCorners = Array(TopLeft, TopRight, BottomLeft, BottomRight)
'End Function
Function IsInArray(stringToBeFound As String, arr As Variant) As Boolean
'Returns True if stringToBeFound is in the array (arr); else False
'This one-liner need not be in a fucntion, but makes reading code easier.
IsInArray = (UBound(Filter(arr, stringToBeFound)) > -1)
End Function
Function GuessPartNum(splitPartCode As Variant, Optional delim As String = ";")
' Find a way to determine what part of the partCode is the part number.
' Perhaps it is always last. Perhaps it always has at least 3 digits.
' Simply takes the last item from the part code. Update this logic to whatever
' makes sense for your dataset (which I could nto see when writing this).
GuessPartNum = splitPartCode(UBound(splitPartCode))
End Function
Function GuessSupplier(splitPartCode As Variant, supplierList As Variant, Optional delim As String = ";")
' Determine the supplier of this part from the partCode.
' For each supplier in the supplierList, see if the supplier name is in the partCode.
Dim i As Integer
For i = LBound(supplierList) To UBound(supplierList)
'Simply verifies if a supplier from supplierList is in the part code. Uses first match.
If (UBound(Filter(splitPartCode, supplierList(i, 1))) > -1) Then 'if arr(i) is in supplier_array
GuessSupplier = supplierList(i, 1)
Exit Function
End If
Next i
End Function
Function ProcessPartCodes(partCodeRange As Range, supplierListRange As Range, Optional delim As String = ";")
' Main ProcessPartCodes
'
' PartCodeRange: a range representing the part code list;
' must be in single column form.
' SupplierList: array of supplier names as strings
'
' Load part code array into array
Dim resultArr As Variant 'result set as array
Dim supplierList As Variant
Dim splitCode As Variant
Dim i As Integer
resultArr = RangeToArray(partCodeRange)
ReDim Preserve resultArr(LBound(resultArr) To UBound(resultArr), 0 To 4) As Variant
supplierList = RangeToArray(supplierListRange)
' Get supplier and part num from each part code
For i = LBound(resultArr) To UBound(resultArr)
If Len(resultArr(i, 0)) > 0 Then
splitCode = Split(resultArr(i, 0), delim) ' Split the partCode by delimiters, semi-colon (;)
resultArr(i, 0) = resultArr(i, 0) ' Part Code (not parsed)
resultArr(i, 1) = GuessPartNum(splitCode) ' Part Number
resultArr(i, 2) = GuessSupplier(splitCode, supplierList) ' Supplier
resultArr(i, 3) = splitCode ' Part Code (parsed)
'resultArr(i, 4) ' reserved for match information
Else
' Empty array element.
splitCode = ""
resultArr(i, 3) = Array()
End If
Next i
ProcessPartCodes = resultArr
End Function
Function CompareParts(splitPartCode1 As Variant, splitPartCode2 As Variant)
'
'
'splitPartCode1 is an array of a parsed partCode string
'splitPartCode2 is an array of a parsed partCode string
Dim matches() As String
Dim i As Integer
Dim matchCnt As String
ReDim matches(0 To 0) As String
' Check each item in arr1 (each substring of partCode1) for a match in arr2
For i = LBound(splitPartCode1) To UBound(splitPartCode1)
If (UBound(Filter(splitPartCode2, splitPartCode1(i))) > -1) Then 'if arr1(i) is in arr2
' Found an item in splitPartCode1 (a substring in partCode1) that is also in splitPartCode2.
' Add this item to the list of matches.
If LBound(matches) = -1 Then
ReDim matches(0 To 0) As String
Else
ReDim Preserve matches(LBound(matches) To UBound(matches) + 1) As String ' grow the matches array by one
End If
matches(UBound(matches)) = splitPartCode1(i) ' set value of last item in matches() = this item (this substring of partCode1)
End If
Next i
matchCnt = UBound(matches) - LBound(matches) + 1 ' Total number of matching substrings from each part.
CompareParts = Array(matchCnt, matches)
End Function
Function FindMatches(partCodeArr As Variant, Optional DeepArr As Boolean = False)
' Fucntion compares 2 part numbers to determine likelihood of a match.
' Parses partCode1 and partCode2 using the delimiter into arrays of strings.
' Then counts the number of matching strings in each array.
' Then determines if the part numbers (assumed to be the last string of each array) match.
' After running this, you can use the match count (matchCnt integer) and part number match
' (partNumMatch boolean) as a basis for determining how likely it is that partCode1=partCode2.
'
'
' DeepArr: If True, returns 3+ dimensional array. If False, flattens results to 2D array.
'
' Returns: Array(partCode1, partCode2, partNum1, partNum2, matchCnt, pricePct, supplierMatch, partNumMatch)
' partCode1 = partCode1 input argument
' partCode2 = partCode2 input argument
' partNum1 = the portion (substring) of partCode1 after the last ocurrence of the delimiter, delim.
' partNum2 = the portion (substring) of partCode2 after the last ocurrence of the delimiter, delim.
' match (boolean) = True if parts are likely the same.
' matchCnt = number of matching sub-strings between partCode1 and part 2
' (essentially, a match score, where higher is more likely a positive match)
' Returns -1 if partCode1=partCode2, meaning exact match.
' pricePct = percentage price match calculated as (decimal portion of price1/price2) * 100
' partNumMatch = True is partNum1=partNum2; else False
Dim i As Integer, j As Integer, k As Integer
Dim partCodei, partCodej
Dim partNumi As String, partNumj As String, numMatch As Boolean
Dim Duplicate As Boolean, newMatch As Boolean
Dim partSupplieri As String, partSupplierj As String, supplierMatch As Boolean
Dim splitCodei() As String, splitCodej() As String, matchCnt As Integer
Dim splitCompare
Dim matches() As String 'empty array has LBound=0 and UBound=-1, so UBound-LBound=-1 indicates an empty array
Dim matchstr As String
Dim s As String
matchCnt = 0 ' matchCnt = UBound(matches) - LBound(matches) + 1 ' starting with 0 matches.
For i = LBound(partCodeArr) To UBound(partCodeArr)
If i = 1 Or i = UBound(partCodeArr) Or i Mod 100 = 0 _
Then Debug.Print "Starting record " & i & ": " & Now()
If partCodeArr(i, 0) <> "" Then
matchstr = ""
For j = i + 1 To UBound(partCodeArr)
If Len(partCodeArr(j, 0)) > 0 Then
partCodei = partCodeArr(i, 0)
partCodej = partCodeArr(j, 0)
Duplicate = partCodei = partCodej 'found duplicate entry in table.
partNumi = partCodeArr(i, 1)
partNumj = partCodeArr(j, 1)
numMatch = partNumi = partNumj
partSupplieri = partCodeArr(i, 2)
partSupplierj = partCodeArr(j, 2)
supplierMatch = partSupplieri = partSupplierj
splitCodei = partCodeArr(i, 3)
splitCodej = partCodeArr(j, 3)
splitCompare = CompareParts(splitCodei, splitCodej)
matchCnt = splitCompare(0)
newMatch = False
If Duplicate Then
' You should have removed duplicates before starting.
On Error GoTo redimErr
ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String
On Error GoTo 0
newMatch = True
matches(UBound(matches), 0) = partCodej 'The duplicate partCode
matches(UBound(matches), 1) = "0" ' Matching score, where -1 indicates an exact duplicate.
matches(UBound(matches), 2) = "Duplicate Entry. Part codes are identical." ' Matching score, where -1 indicates an exact duplicate.
ElseIf supplierMatch And numMatch Then
' Possible duplicate part since supplier and part number both match.
On Error GoTo redimErr
ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String
On Error GoTo 0
newMatch = True
matches(UBound(matches), 0) = partCodej 'The duplicate partCode
matches(UBound(matches), 1) = "1" ' Matching score, where -1 indicates an exact duplicate.
matches(UBound(matches), 2) = "Probably same part with differnt part code. Same supplier and part number." ' Matching score, where -1 indicates an exact duplicate.
ElseIf supplierMatch And matchCnt > 2 Then
' Possible duplicate part since supplier and part number both match.
On Error GoTo redimErr
ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String
On Error GoTo 0
newMatch = True
matches(UBound(matches), 0) = partCodej 'The duplicate partCode
matches(UBound(matches), 1) = "2" ' Matching score, where -1 indicates an exact duplicate.
matches(UBound(matches), 2) = "Possible duplicate. More likely a similar part from same supplier" ' Matching score, where -1 indicates an exact duplicate.
ElseIf supplierMatch = False And matchCnt > 2 Then
' Possible duplicate part since supplier and part number both match.
On Error GoTo redimErr
ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String
On Error GoTo 0
newMatch = True
matches(UBound(matches), 0) = partCodej 'The duplicate partCode
matches(UBound(matches), 1) = "3" ' Matching score, where -1 indicates an exact duplicate.
matches(UBound(matches), 2) = "Possible part match from different supplier" ' Matching score, where -1 indicates an exact duplicate.
ElseIf supplierMatch = False And matchCnt > 1 Then
' Possible duplicate part since supplier and part number both match.
On Error GoTo redimErr
ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String
On Error GoTo 0
newMatch = True
matches(UBound(matches), 0) = partCodej 'The duplicate partCode
matches(UBound(matches), 1) = "4" ' Matching score, where -1 indicates an exact duplicate.
matches(UBound(matches), 2) = "Low probability part match from different supplier" ' Matching score, where -1 indicates an exact duplicate.
End If
If newMatch And Not DeepArr Then
For k = LBound(matches) To UBound(matches)
matchstr = matchstr & "[" & partCodej & "," & matches(UBound(matches), 1) & "," & matches(UBound(matches), 2) & "], "
Next k
End If
End If
Next j
If DeepArr Then
' return 3+ dimensional array
partCodeArr(i, 4) = matches
Else
' return 2D array for easier pasting to worksheet
' Flatten partCodeArr(i, 4), the parsed potential part matches to an ordinary string
' with format [[part code, match value, match description],[part code, match value, match description],...]
If Len(matchstr) > 0 Then
matchstr = "[ " & Left(matchstr, Len(matchstr) - 2) & "] "
End If
partCodeArr(i, 4) = matchstr
' Flatten the parsed part code back to original string format.
partCodeArr(i, 3) = partCodeArr(i, 0)
End If
ReDim matches(0) As String
End If
Next i
FindMatches = partCodeArr
Exit Function
redimErr:
ReDim matches(0 To 0, 0 To 2) As String
Resume Next
End Function
Sub RunMain()
' Kicks off Main(partCodeRange As Range, supplierListRange As Range, destination As Range)
'
' Arguments:
' partCodeRange = Excel Range (not string name of range)
' that contains the raw part code list
' supplierListRange = Excel Range (not string name of range)
' that contains a unique list of supplier
' codes found in the part codes.
'
Call Main(Sheets("PartCodes").Range("B3:B10"), Sheets("Suppliers").Range("B4:B6"), Range("PartCodes!D2"))
End Sub
Sub Main(partCodeRange As Range, supplierListRange As Range, destination As Range)
' This is the main sub that runs the full process of finding equivalent part
' codes and writing the findings to an excel worksheet.
' See RunMain() sub for example use.
'
' Arguments:
' partCodeRange = Excel Range (not string name of range)
' that contains the raw part code list
' supplierListRange = Excel Range (not string name of range)
' that contains a unique list of supplier
' codes found in the part codes.
'
Dim partCodesArr, matchArr
Dim startdate As Date, stopdate As Date
startdate = Now()
Debug.Print
Debug.Print String(70, "*")
Debug.Print
Debug.Print "Starting: " & startdate
Debug.Print
partCodesArr = ProcessPartCodes(partCodeRange, supplierListRange)
matchArr = FindMatches(partCodesArr) ' FindMatches(partCodesArr, True) for 3+ dimensional results
Sheets("PartCodes").Activate
'Write column headers.
destination.Offset(0, 0) = "Part Code"
destination.Offset(0, 1) = "Part Num"
destination.Offset(0, 2) = "Part Supplier"
destination.Offset(0, 3) = "Part Code"
destination.Offset(0, 4) = "Potential equivalent part numbers"
Call ArrayToRange(matchArr, destination.Offset(1, 0))
stopdate = Now()
Debug.Print
Debug.Print "Finished: " & stopdate
Debug.Print
Debug.Print "Run time: " & (stopdate - startdate)
Debug.Print
Debug.Print String(70, "*")
Debug.Print
End Sub
快速后台: 我正在 Visual Basic 中创建一个搜索工具,它允许我在我的数据库中搜索名称不一致的 material,这些都是作为自由文本输入的。虽然我已经(在 Stack Overflow 用户的帮助下)开发了一个可以一次搜索数百个或多个项目的工具,但我需要进一步改进它。
我的问题: 我需要能够从这些 material 描述中提取项目代码。这些项目是一般数字,例如:20405-002 或者:A445 甚至 B463-563 .这些是我要搜索的主要代码类型,它们将是唯一标识符。
一些例子:
在意大利的一家工厂里,我有一个 material 命名为:
Siemens;Motor;A4002
在德国的一家工厂,它被称为:
Motor;FP4742;Siemens;TurnFast;A4002
我会搜索术语 Siemens, Motor
我当前的搜索是 return Siemens, Motor 在第一个旁边, Motor, Siemens 在第一个旁边第二。然后我希望 visual basic 在本质上说 'these could be the same part',然后在两者中查找匹配的代码。当它找到匹配代码时,我希望它在 excel 单元格中 return 某种指示器。
总体目标: 拥有一个工具,我可以用它来查找两个 material 是否实际上相同,并且需要最少的人工输入。两株植物中的每株最多可能有 50,000 materials。我也有这些零件的价格和供应商。虽然供应商有 75% 的时间是相同的,但价格通常与不同国家/地区的相同 material 相差 20% 以内。如果您对如何查看两个自由文本 material 是否实际上相同有任何其他想法,我很乐意听取。
我的搜索码:
Function MultiSplitX(ByVal SourceText As String, RemoveBlankItems As Boolean, ParamArray Delimiters()) As String()
Dim a As Integer, b As Integer, n As Integer
Dim i As Integer: i = 33
Dim u As Variant, v As Variant
Dim tempArr() As String, finalArr() As String, fDelimiters() As String
If InStr(TypeName(Delimiters(0)), "()") <> 0 And LBound(Delimiters) = UBound(Delimiters) Then
ReDim fDelimiters(LBound(Delimiters(0)) To UBound(Delimiters(0))) 'If passing array vs array items then
For a = LBound(Delimiters(0)) To UBound(Delimiters(0)) 'build that array
fDelimiters(a) = Delimiters(0)(a)
Next a
Else
fDelimiters = Delimiters(0)
End If
Do While InStr(SourceText, Chr(i)) <> 0 'Find an unused character
i = i + 1
Loop
For a = LBound(fDelimiters) To UBound(fDelimiters) 'Sort Delimiters by length
For b = a + 1 To UBound(fDelimiters)
If Len(fDelimiters(a)) < Len(fDelimiters(b)) Then
u = fDelimiters(b)
fDelimiters(b) = fDelimiters(a)
fDelimiters(a) = u
End If
Next b
Next a
For Each v In fDelimiters 'Replace Delimiters with a common character
SourceText = Replace(SourceText, v, Chr(i))
Next
tempArr() = Split(SourceText, Chr(i)) 'Remove empty array items
If RemoveBlankItems = True Then
ReDim finalArr(LBound(tempArr) To UBound(tempArr))
n = LBound(tempArr)
For i = LBound(tempArr) To UBound(tempArr)
If tempArr(i) <> "" Then
finalArr(n) = tempArr(i)
n = n + 1
End If
Next i
n = n - 1
ReDim Preserve finalArr(LBound(tempArr) To n)
MultiSplitX = finalArr
Else: MultiSplitX = tempArr
End If
Erase finalArr
Erase tempArr
End Function
感谢大家的帮助:)
这是用 VBA 为 Excel 编写的响应,但使用数组 get/put 数据,因此您应该能够轻松地为数据库修改它。 VB非常相似。如果我要完成这项工作,我会在 MS Access 中完成,在这种情况下,您可以更轻松地调整此代码。当然,直接 VB 始终是一种选择。 VB 不是一个很好的工具。
如果您经常处理数据,我强烈建议您学习免费和开源的 Python 语言。您可以在 Youtube 上找到来自 Sentdex 的精彩 Python 视频系列。他的视频很好,很慢。您将很快超越 VB.
所能达到的水平由于缺乏细节和小样本数据,很难全面回答这个问题。
有很多方法可以解决这个问题,具体取决于您想要的输出。我正在做以下假设。
- 您是编码新手,希望输出易于阅读。作为 因此,我的解决方案默认为单个 2x2 结果数组。你 可以通过设置 DeepArr = True 将其更改为 3+ 维。
- 您希望将结果粘贴到同一作品中sheet。
- 您有一个单独的 supplier/vendor 代码列表,可以在零件代码中找到。 GuessSupplier 函数取决于此假设。如果需要,根据实际需求更新功能。
- 我将您的原始输入(如 Siemens;Motor;A4002)称为零件代码。
- 我假设最后 semi-colon 之后的文本将始终是该部分 数字。如果没有,您可以轻松地替换该假设 GuessPartNum 函数。
下面介绍我用来简单测试的点差sheet。
Sheet "PartCodes" 在单列中包含零件代码,在单元格 B3:B6 中包含示例值(B2 中的 header)。列 G-H 保留用于结果。
Sheet "Suppliers" 在单个列中包含唯一的供应商列表 (B3:B6)。
您可以在 RunMain() 子程序中为输入和输出指定 sheet 名称和范围。
为方便起见,我在某些地方对 sheet 名称进行了硬编码。你应该把这些作为论据浮出水面。
为了便于理解,代码有些冗长。
我没有测试性能,因为我没有数据集,希望你 运行 很少这样做。
我只添加了少量的错误处理。
我的完整代码集如下。您会在底部附近找到 RunMain() 子程序。这将启动控制工作流的 Main() sub。
Option Base 0
Option Explicit
' 1) Manually eliminate duplicates in your parts list using Excel built-in feature.
' a) highlight the range
' b) Data ribbon > Remove Duplicates
' 2) Create a supplier list on a separate sheet in teh same workbook
' 3) Edit the RunMain() procedure per your data. I assume: your part code list
' - part code list is in cells B3:B10 of the PartCodes sheet.
' - supplier list in cells b4:b6 of Suppliers sheet.
' - output goes to D2 in PartCodes sheet.
' 4) Run the RunMain() procedure simply kicks off Main.
' Main() sub does the following:
' a)Run ProcessPartCodes:
' i. load the parts codes from the worksheet into an array
' ii. run GuessPartNum and GuessSupplier and place results in the parts code array.
' b) Run FindMatches to add more to the array. Finds other part codes that may be for the same part.
' Logic is described in the function.
' c) Run ArrayToRange to paste part of the result set to the workseet. Note that
' the ourput array is more than 2 dimensions, so not all data is pasted neatly.
' I leave it to you to determine how you want to format the data for output.
'
Function RangeToArray(inputRange As Range)
'Copies values from a rectangular range to a 2D Array.
'Array is always 2D, even if data is a single column or row.
'inputRange: a rectangular range
Dim Col1 As Integer, row1 As Integer
Dim i As Integer, j As Integer
Dim rowCnt As Integer
Dim colCnt As Integer
Dim retArr() As Variant
' Size output array
rowCnt = inputRange.Rows.Count
colCnt = inputRange.Columns.Count
ReDim retArr(1 To rowCnt, 1 To colCnt) As Variant
' Load range values into array
For i = 1 To rowCnt
For j = 1 To (colCnt)
retArr(i, j) = Trim(inputRange.Cells(i, j))
Next j
Next i
' Return array
RangeToArray = retArr
End Function
Sub ArrayToRange(myArr As Variant, Target As Range)
' Copies the content of a 2D array to a Range.
' myArr must be exactly 2 dimensions
' Target is a range. If more than 1 cell, the top left cell is used.
' Copies the array to the range starting with the top left cell.
' Target Range size can be a single cell and need not match the array dimensions.
Dim r As Long, tgtRow As Long
Dim c As Long, tgtCol As Long
Dim firstRow As Long
Dim firstCol As Long
Dim lastRow As Long
Dim lastCol As Long
' Find the top left cell of the Target Range
tgtRow = Target.Row
tgtCol = Target.Column
' Set target range dimesions based on array size.
firstRow = tgtRow + LBound(myArr, 1)
firstCol = tgtCol + LBound(myArr, 2)
lastRow = tgtRow + UBound(myArr, 1)
lastCol = tgtCol + UBound(myArr, 2)
' The next row would usually work. If you get funky data, it will fail,
' so, we will use a loop instead.
' Range(Cells(firstRow, firstCol), Cells(lastRow, lastCol)) = myArr
' Loop through rows and columns, setting cell values one at a time.
For r = LBound(myArr, 1) To UBound(myArr, 1)
For c = LBound(myArr, 2) To UBound(myArr, 2)
On Error Resume Next ' Prevent one bad value from killing the entire operation.
Cells(tgtRow + r - 1, tgtCol + c) = myArr(r, c)
On Error GoTo 0
Next c
Next r
End Sub
' Not used, this is just an example
'Public Function RangeCorners(Optional MyRange As Range = Range("c2:c10"))
' TopLeft = MyRange.Cells(1)
' BottomLeft = MyRange.Cells(.Rows.Count, 1)
' TopRight = MyRange.Cells(1, .Columns.Count)
' BottomRigt = MyRange.Cells(.Cells.Count)
' RangeCorners = Array(TopLeft, TopRight, BottomLeft, BottomRight)
'End Function
Function IsInArray(stringToBeFound As String, arr As Variant) As Boolean
'Returns True if stringToBeFound is in the array (arr); else False
'This one-liner need not be in a fucntion, but makes reading code easier.
IsInArray = (UBound(Filter(arr, stringToBeFound)) > -1)
End Function
Function GuessPartNum(splitPartCode As Variant, Optional delim As String = ";")
' Find a way to determine what part of the partCode is the part number.
' Perhaps it is always last. Perhaps it always has at least 3 digits.
' Simply takes the last item from the part code. Update this logic to whatever
' makes sense for your dataset (which I could nto see when writing this).
GuessPartNum = splitPartCode(UBound(splitPartCode))
End Function
Function GuessSupplier(splitPartCode As Variant, supplierList As Variant, Optional delim As String = ";")
' Determine the supplier of this part from the partCode.
' For each supplier in the supplierList, see if the supplier name is in the partCode.
Dim i As Integer
For i = LBound(supplierList) To UBound(supplierList)
'Simply verifies if a supplier from supplierList is in the part code. Uses first match.
If (UBound(Filter(splitPartCode, supplierList(i, 1))) > -1) Then 'if arr(i) is in supplier_array
GuessSupplier = supplierList(i, 1)
Exit Function
End If
Next i
End Function
Function ProcessPartCodes(partCodeRange As Range, supplierListRange As Range, Optional delim As String = ";")
' Main ProcessPartCodes
'
' PartCodeRange: a range representing the part code list;
' must be in single column form.
' SupplierList: array of supplier names as strings
'
' Load part code array into array
Dim resultArr As Variant 'result set as array
Dim supplierList As Variant
Dim splitCode As Variant
Dim i As Integer
resultArr = RangeToArray(partCodeRange)
ReDim Preserve resultArr(LBound(resultArr) To UBound(resultArr), 0 To 4) As Variant
supplierList = RangeToArray(supplierListRange)
' Get supplier and part num from each part code
For i = LBound(resultArr) To UBound(resultArr)
If Len(resultArr(i, 0)) > 0 Then
splitCode = Split(resultArr(i, 0), delim) ' Split the partCode by delimiters, semi-colon (;)
resultArr(i, 0) = resultArr(i, 0) ' Part Code (not parsed)
resultArr(i, 1) = GuessPartNum(splitCode) ' Part Number
resultArr(i, 2) = GuessSupplier(splitCode, supplierList) ' Supplier
resultArr(i, 3) = splitCode ' Part Code (parsed)
'resultArr(i, 4) ' reserved for match information
Else
' Empty array element.
splitCode = ""
resultArr(i, 3) = Array()
End If
Next i
ProcessPartCodes = resultArr
End Function
Function CompareParts(splitPartCode1 As Variant, splitPartCode2 As Variant)
'
'
'splitPartCode1 is an array of a parsed partCode string
'splitPartCode2 is an array of a parsed partCode string
Dim matches() As String
Dim i As Integer
Dim matchCnt As String
ReDim matches(0 To 0) As String
' Check each item in arr1 (each substring of partCode1) for a match in arr2
For i = LBound(splitPartCode1) To UBound(splitPartCode1)
If (UBound(Filter(splitPartCode2, splitPartCode1(i))) > -1) Then 'if arr1(i) is in arr2
' Found an item in splitPartCode1 (a substring in partCode1) that is also in splitPartCode2.
' Add this item to the list of matches.
If LBound(matches) = -1 Then
ReDim matches(0 To 0) As String
Else
ReDim Preserve matches(LBound(matches) To UBound(matches) + 1) As String ' grow the matches array by one
End If
matches(UBound(matches)) = splitPartCode1(i) ' set value of last item in matches() = this item (this substring of partCode1)
End If
Next i
matchCnt = UBound(matches) - LBound(matches) + 1 ' Total number of matching substrings from each part.
CompareParts = Array(matchCnt, matches)
End Function
Function FindMatches(partCodeArr As Variant, Optional DeepArr As Boolean = False)
' Fucntion compares 2 part numbers to determine likelihood of a match.
' Parses partCode1 and partCode2 using the delimiter into arrays of strings.
' Then counts the number of matching strings in each array.
' Then determines if the part numbers (assumed to be the last string of each array) match.
' After running this, you can use the match count (matchCnt integer) and part number match
' (partNumMatch boolean) as a basis for determining how likely it is that partCode1=partCode2.
'
'
' DeepArr: If True, returns 3+ dimensional array. If False, flattens results to 2D array.
'
' Returns: Array(partCode1, partCode2, partNum1, partNum2, matchCnt, pricePct, supplierMatch, partNumMatch)
' partCode1 = partCode1 input argument
' partCode2 = partCode2 input argument
' partNum1 = the portion (substring) of partCode1 after the last ocurrence of the delimiter, delim.
' partNum2 = the portion (substring) of partCode2 after the last ocurrence of the delimiter, delim.
' match (boolean) = True if parts are likely the same.
' matchCnt = number of matching sub-strings between partCode1 and part 2
' (essentially, a match score, where higher is more likely a positive match)
' Returns -1 if partCode1=partCode2, meaning exact match.
' pricePct = percentage price match calculated as (decimal portion of price1/price2) * 100
' partNumMatch = True is partNum1=partNum2; else False
Dim i As Integer, j As Integer, k As Integer
Dim partCodei, partCodej
Dim partNumi As String, partNumj As String, numMatch As Boolean
Dim Duplicate As Boolean, newMatch As Boolean
Dim partSupplieri As String, partSupplierj As String, supplierMatch As Boolean
Dim splitCodei() As String, splitCodej() As String, matchCnt As Integer
Dim splitCompare
Dim matches() As String 'empty array has LBound=0 and UBound=-1, so UBound-LBound=-1 indicates an empty array
Dim matchstr As String
Dim s As String
matchCnt = 0 ' matchCnt = UBound(matches) - LBound(matches) + 1 ' starting with 0 matches.
For i = LBound(partCodeArr) To UBound(partCodeArr)
If i = 1 Or i = UBound(partCodeArr) Or i Mod 100 = 0 _
Then Debug.Print "Starting record " & i & ": " & Now()
If partCodeArr(i, 0) <> "" Then
matchstr = ""
For j = i + 1 To UBound(partCodeArr)
If Len(partCodeArr(j, 0)) > 0 Then
partCodei = partCodeArr(i, 0)
partCodej = partCodeArr(j, 0)
Duplicate = partCodei = partCodej 'found duplicate entry in table.
partNumi = partCodeArr(i, 1)
partNumj = partCodeArr(j, 1)
numMatch = partNumi = partNumj
partSupplieri = partCodeArr(i, 2)
partSupplierj = partCodeArr(j, 2)
supplierMatch = partSupplieri = partSupplierj
splitCodei = partCodeArr(i, 3)
splitCodej = partCodeArr(j, 3)
splitCompare = CompareParts(splitCodei, splitCodej)
matchCnt = splitCompare(0)
newMatch = False
If Duplicate Then
' You should have removed duplicates before starting.
On Error GoTo redimErr
ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String
On Error GoTo 0
newMatch = True
matches(UBound(matches), 0) = partCodej 'The duplicate partCode
matches(UBound(matches), 1) = "0" ' Matching score, where -1 indicates an exact duplicate.
matches(UBound(matches), 2) = "Duplicate Entry. Part codes are identical." ' Matching score, where -1 indicates an exact duplicate.
ElseIf supplierMatch And numMatch Then
' Possible duplicate part since supplier and part number both match.
On Error GoTo redimErr
ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String
On Error GoTo 0
newMatch = True
matches(UBound(matches), 0) = partCodej 'The duplicate partCode
matches(UBound(matches), 1) = "1" ' Matching score, where -1 indicates an exact duplicate.
matches(UBound(matches), 2) = "Probably same part with differnt part code. Same supplier and part number." ' Matching score, where -1 indicates an exact duplicate.
ElseIf supplierMatch And matchCnt > 2 Then
' Possible duplicate part since supplier and part number both match.
On Error GoTo redimErr
ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String
On Error GoTo 0
newMatch = True
matches(UBound(matches), 0) = partCodej 'The duplicate partCode
matches(UBound(matches), 1) = "2" ' Matching score, where -1 indicates an exact duplicate.
matches(UBound(matches), 2) = "Possible duplicate. More likely a similar part from same supplier" ' Matching score, where -1 indicates an exact duplicate.
ElseIf supplierMatch = False And matchCnt > 2 Then
' Possible duplicate part since supplier and part number both match.
On Error GoTo redimErr
ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String
On Error GoTo 0
newMatch = True
matches(UBound(matches), 0) = partCodej 'The duplicate partCode
matches(UBound(matches), 1) = "3" ' Matching score, where -1 indicates an exact duplicate.
matches(UBound(matches), 2) = "Possible part match from different supplier" ' Matching score, where -1 indicates an exact duplicate.
ElseIf supplierMatch = False And matchCnt > 1 Then
' Possible duplicate part since supplier and part number both match.
On Error GoTo redimErr
ReDim Preserve matches(0 To UBound(matches) + 1, 0 To 2) As String
On Error GoTo 0
newMatch = True
matches(UBound(matches), 0) = partCodej 'The duplicate partCode
matches(UBound(matches), 1) = "4" ' Matching score, where -1 indicates an exact duplicate.
matches(UBound(matches), 2) = "Low probability part match from different supplier" ' Matching score, where -1 indicates an exact duplicate.
End If
If newMatch And Not DeepArr Then
For k = LBound(matches) To UBound(matches)
matchstr = matchstr & "[" & partCodej & "," & matches(UBound(matches), 1) & "," & matches(UBound(matches), 2) & "], "
Next k
End If
End If
Next j
If DeepArr Then
' return 3+ dimensional array
partCodeArr(i, 4) = matches
Else
' return 2D array for easier pasting to worksheet
' Flatten partCodeArr(i, 4), the parsed potential part matches to an ordinary string
' with format [[part code, match value, match description],[part code, match value, match description],...]
If Len(matchstr) > 0 Then
matchstr = "[ " & Left(matchstr, Len(matchstr) - 2) & "] "
End If
partCodeArr(i, 4) = matchstr
' Flatten the parsed part code back to original string format.
partCodeArr(i, 3) = partCodeArr(i, 0)
End If
ReDim matches(0) As String
End If
Next i
FindMatches = partCodeArr
Exit Function
redimErr:
ReDim matches(0 To 0, 0 To 2) As String
Resume Next
End Function
Sub RunMain()
' Kicks off Main(partCodeRange As Range, supplierListRange As Range, destination As Range)
'
' Arguments:
' partCodeRange = Excel Range (not string name of range)
' that contains the raw part code list
' supplierListRange = Excel Range (not string name of range)
' that contains a unique list of supplier
' codes found in the part codes.
'
Call Main(Sheets("PartCodes").Range("B3:B10"), Sheets("Suppliers").Range("B4:B6"), Range("PartCodes!D2"))
End Sub
Sub Main(partCodeRange As Range, supplierListRange As Range, destination As Range)
' This is the main sub that runs the full process of finding equivalent part
' codes and writing the findings to an excel worksheet.
' See RunMain() sub for example use.
'
' Arguments:
' partCodeRange = Excel Range (not string name of range)
' that contains the raw part code list
' supplierListRange = Excel Range (not string name of range)
' that contains a unique list of supplier
' codes found in the part codes.
'
Dim partCodesArr, matchArr
Dim startdate As Date, stopdate As Date
startdate = Now()
Debug.Print
Debug.Print String(70, "*")
Debug.Print
Debug.Print "Starting: " & startdate
Debug.Print
partCodesArr = ProcessPartCodes(partCodeRange, supplierListRange)
matchArr = FindMatches(partCodesArr) ' FindMatches(partCodesArr, True) for 3+ dimensional results
Sheets("PartCodes").Activate
'Write column headers.
destination.Offset(0, 0) = "Part Code"
destination.Offset(0, 1) = "Part Num"
destination.Offset(0, 2) = "Part Supplier"
destination.Offset(0, 3) = "Part Code"
destination.Offset(0, 4) = "Potential equivalent part numbers"
Call ArrayToRange(matchArr, destination.Offset(1, 0))
stopdate = Now()
Debug.Print
Debug.Print "Finished: " & stopdate
Debug.Print
Debug.Print "Run time: " & (stopdate - startdate)
Debug.Print
Debug.Print String(70, "*")
Debug.Print
End Sub