匹配多列时查找重复项

Finding duplicates when matching multiple columns

@alon adler 昨天帮助我创建了一些 VBA 代码以从 Excel sheet 中删除重复行,我们正在询问的单元格具有某种颜色,并且在同一列中有另一个具有相同值的单元格。

我现在需要调整代码,以匹配行中的多列,而不仅仅是一列。 他的代码如下:

Sub sbFindDuplicatesInColumn_With_Color_Condition()

Dim toDel(), i As Long
Dim RNG As Range, Cell As Long

'Declare and set the worksheet where your data is stored
Dim sheet As Worksheet
Set sheet = Worksheets("Sheet1")

'Finding the last row in the Column 1
lastRow = sheet.Cells(sheet.Rows.Count, 1).End(xlUp).Row

'Set the range to the last row of data in the column
Set RNG = Range("a1:a" & lastRow) 'set your range here

'Iterate over the column, finding duplicates and store their address in an array
For Cell = 1 To RNG.Cells.Count
    If Application.CountIf(RNG, RNG(Cell)) > 1 Then
        ReDim Preserve toDel(i)
        toDel(i) = RNG(Cell).Address
        i = i + 1
    End If
Next
'Iterate over the array and remove duplicates with specific color index (in this example - remove the yellow ones)
For i = UBound(toDel) To LBound(toDel) Step -1
    If Range(toDel(i)).Cells.Interior.ColorIndex = 6 Then
        Range(toDel(i)).Cells.Value = ""
    End If
Next i

End Sub

我真的不知道 VBA,想知道一个有效的方法(spreadsheet 有大约 45,000 行要检查),调整代码以检查一系列列都与另一行匹配。

例如,我们要确保要将一行视为重复行,A 到 O 列中的所有单元格都必须与另一行中相应列中的值相匹配。

然后我们需要删除该行(如果它不是白色的)。我确定删除命令是:

Rows(RowToDel).EntireRow.Delete

我相信我会改变:

If Range(toDel(i)).Cells.Interior.ColorIndex = 6 Then

收件人:

If Range(toDel(i)).Cells.Interior.ColorIndex <> 0 Then

如果有人可以协助匹配多个列,我想我会很高兴。

尝试以下操作。对于支持图像中显示的数据,第 2 行将被删除,因为它是重复的,并且它的内部不像 -4142,即有填充。

代码进入标准模块。

您可以un-comment删除代码来执行删除

unionRng.EntireRow.Delete

并注释掉

Debug.Print unionRng.EntireRow.Address 

当前显示的是将被删除的内容。

是的,它有点乱,可以使用一些 re-factoring,例如,您可以使用 Join 将 Evaluate(CONCATENATE...) 字符串缩短为更短的字符串。

代码:

Option Explicit

Public Sub sbFindDuplicatesInColumn_With_Color_Condition()

    Dim RNG As Range
    Dim wb As Workbook
    Dim currentRow As Long

    Dim targetSheet As Worksheet
    Dim lastRow As Long

    Set wb = ThisWorkbook
    Set targetSheet = wb.Worksheets("Sheet1")

    With targetSheet

        lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row 'find the last row in column A and use this to determine the number of rows in range to work with

        Set RNG = .Range("A1:O" & lastRow)       'set your range here

        Dim toDel()
        toDel = RNG.Value2
        ReDim Preserve toDel(1 To UBound(toDel, 1), 1 To UBound(toDel, 2) + 2)

        Dim concatValuesDict As Scripting.Dictionary 'As Object
        Set concatValuesDict = New Scripting.Dictionary ' = CreateObject("Scripting.Dictionary")

        For currentRow = LBound(toDel, 1) To UBound(toDel, 1)

            Dim joinedString As String

            joinedString = Evaluate("CONCATENATE(""" & toDel(currentRow, 1) & """,""" & toDel(currentRow, 2) & toDel(currentRow, 3) & """,""" _
            & toDel(currentRow, 4) & """,""" & toDel(currentRow, 5) & """,""" & toDel(currentRow, 6) & """,""" & _
            toDel(currentRow, 7) & """,""" & toDel(currentRow, 8) & toDel(currentRow, 9) & """,""" & toDel(currentRow, 10) & """,""" & _
            toDel(currentRow, 11) & """,""" & toDel(currentRow, 12) & """,""" & toDel(currentRow, 13) & """,""" & toDel(currentRow, 14) & """,""" & _
           toDel(currentRow, 15) & """)") 'create a unique key for each row to determine duplicates by concatenating each column in the range

           toDel(currentRow, UBound(toDel, 2) - 1) = joinedString

           If Not concatValuesDict.Exists(joinedString) Then
               concatValuesDict.Add joinedString, False 'add these "keys" to a dictionary, and if "key" not already present then associated dictionary value =False
           Else
              concatValuesDict(joinedString) = True 'key seen before so duplicate so set value to True
           End If

        Next currentRow

        Dim unionRng As Range

        For currentRow = LBound(toDel, 1) To UBound(toDel, 1)

           toDel(currentRow, UBound(toDel, 2)) = concatValuesDict(toDel(currentRow, UBound(toDel, 2) - 1))

         If toDel(currentRow, UBound(toDel, 2)) And targetSheet.Rows(currentRow).Interior.ColorIndex <> -4142 Then  'Some fill is applied and the dictionary value for this row is True (i.e. is a duplicate)

             If Not unionRng Is Nothing Then
                Set unionRng = Union(unionRng, targetSheet.Rows(currentRow).Cells(1, 1))   'add the key to a range for later deletion using union 
             Else
                Set unionRng = targetSheet.Rows(currentRow).Cells(1, 1)
             End If
         End If

        Next currentRow

        If Not unionRng Is Nothing Then

            Debug.Print unionRng.EntireRow.Address
            ' unionRng.EntireRow.Delete

        End If

    End With

End Sub

数据图片:

示例运行: