匹配多列时查找重复项
Finding duplicates when matching multiple columns
@alon adler 昨天帮助我创建了一些 VBA 代码以从 Excel sheet 中删除重复行,我们正在询问的单元格具有某种颜色,并且在同一列中有另一个具有相同值的单元格。
我现在需要调整代码,以匹配行中的多列,而不仅仅是一列。
他的代码如下:
Sub sbFindDuplicatesInColumn_With_Color_Condition()
Dim toDel(), i As Long
Dim RNG As Range, Cell As Long
'Declare and set the worksheet where your data is stored
Dim sheet As Worksheet
Set sheet = Worksheets("Sheet1")
'Finding the last row in the Column 1
lastRow = sheet.Cells(sheet.Rows.Count, 1).End(xlUp).Row
'Set the range to the last row of data in the column
Set RNG = Range("a1:a" & lastRow) 'set your range here
'Iterate over the column, finding duplicates and store their address in an array
For Cell = 1 To RNG.Cells.Count
If Application.CountIf(RNG, RNG(Cell)) > 1 Then
ReDim Preserve toDel(i)
toDel(i) = RNG(Cell).Address
i = i + 1
End If
Next
'Iterate over the array and remove duplicates with specific color index (in this example - remove the yellow ones)
For i = UBound(toDel) To LBound(toDel) Step -1
If Range(toDel(i)).Cells.Interior.ColorIndex = 6 Then
Range(toDel(i)).Cells.Value = ""
End If
Next i
End Sub
我真的不知道 VBA,想知道一个有效的方法(spreadsheet 有大约 45,000 行要检查),调整代码以检查一系列列都与另一行匹配。
例如,我们要确保要将一行视为重复行,A 到 O 列中的所有单元格都必须与另一行中相应列中的值相匹配。
然后我们需要删除该行(如果它不是白色的)。我确定删除命令是:
Rows(RowToDel).EntireRow.Delete
我相信我会改变:
If Range(toDel(i)).Cells.Interior.ColorIndex = 6 Then
收件人:
If Range(toDel(i)).Cells.Interior.ColorIndex <> 0 Then
如果有人可以协助匹配多个列,我想我会很高兴。
尝试以下操作。对于支持图像中显示的数据,第 2 行将被删除,因为它是重复的,并且它的内部不像 -4142,即有填充。
代码进入标准模块。
您可以un-comment删除代码来执行删除
unionRng.EntireRow.Delete
并注释掉
Debug.Print unionRng.EntireRow.Address
当前显示的是将被删除的内容。
是的,它有点乱,可以使用一些 re-factoring,例如,您可以使用 Join 将 Evaluate(CONCATENATE...) 字符串缩短为更短的字符串。
代码:
Option Explicit
Public Sub sbFindDuplicatesInColumn_With_Color_Condition()
Dim RNG As Range
Dim wb As Workbook
Dim currentRow As Long
Dim targetSheet As Worksheet
Dim lastRow As Long
Set wb = ThisWorkbook
Set targetSheet = wb.Worksheets("Sheet1")
With targetSheet
lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row 'find the last row in column A and use this to determine the number of rows in range to work with
Set RNG = .Range("A1:O" & lastRow) 'set your range here
Dim toDel()
toDel = RNG.Value2
ReDim Preserve toDel(1 To UBound(toDel, 1), 1 To UBound(toDel, 2) + 2)
Dim concatValuesDict As Scripting.Dictionary 'As Object
Set concatValuesDict = New Scripting.Dictionary ' = CreateObject("Scripting.Dictionary")
For currentRow = LBound(toDel, 1) To UBound(toDel, 1)
Dim joinedString As String
joinedString = Evaluate("CONCATENATE(""" & toDel(currentRow, 1) & """,""" & toDel(currentRow, 2) & toDel(currentRow, 3) & """,""" _
& toDel(currentRow, 4) & """,""" & toDel(currentRow, 5) & """,""" & toDel(currentRow, 6) & """,""" & _
toDel(currentRow, 7) & """,""" & toDel(currentRow, 8) & toDel(currentRow, 9) & """,""" & toDel(currentRow, 10) & """,""" & _
toDel(currentRow, 11) & """,""" & toDel(currentRow, 12) & """,""" & toDel(currentRow, 13) & """,""" & toDel(currentRow, 14) & """,""" & _
toDel(currentRow, 15) & """)") 'create a unique key for each row to determine duplicates by concatenating each column in the range
toDel(currentRow, UBound(toDel, 2) - 1) = joinedString
If Not concatValuesDict.Exists(joinedString) Then
concatValuesDict.Add joinedString, False 'add these "keys" to a dictionary, and if "key" not already present then associated dictionary value =False
Else
concatValuesDict(joinedString) = True 'key seen before so duplicate so set value to True
End If
Next currentRow
Dim unionRng As Range
For currentRow = LBound(toDel, 1) To UBound(toDel, 1)
toDel(currentRow, UBound(toDel, 2)) = concatValuesDict(toDel(currentRow, UBound(toDel, 2) - 1))
If toDel(currentRow, UBound(toDel, 2)) And targetSheet.Rows(currentRow).Interior.ColorIndex <> -4142 Then 'Some fill is applied and the dictionary value for this row is True (i.e. is a duplicate)
If Not unionRng Is Nothing Then
Set unionRng = Union(unionRng, targetSheet.Rows(currentRow).Cells(1, 1)) 'add the key to a range for later deletion using union
Else
Set unionRng = targetSheet.Rows(currentRow).Cells(1, 1)
End If
End If
Next currentRow
If Not unionRng Is Nothing Then
Debug.Print unionRng.EntireRow.Address
' unionRng.EntireRow.Delete
End If
End With
End Sub
数据图片:
示例运行:
@alon adler 昨天帮助我创建了一些 VBA 代码以从 Excel sheet 中删除重复行,我们正在询问的单元格具有某种颜色,并且在同一列中有另一个具有相同值的单元格。
我现在需要调整代码,以匹配行中的多列,而不仅仅是一列。 他的代码如下:
Sub sbFindDuplicatesInColumn_With_Color_Condition()
Dim toDel(), i As Long
Dim RNG As Range, Cell As Long
'Declare and set the worksheet where your data is stored
Dim sheet As Worksheet
Set sheet = Worksheets("Sheet1")
'Finding the last row in the Column 1
lastRow = sheet.Cells(sheet.Rows.Count, 1).End(xlUp).Row
'Set the range to the last row of data in the column
Set RNG = Range("a1:a" & lastRow) 'set your range here
'Iterate over the column, finding duplicates and store their address in an array
For Cell = 1 To RNG.Cells.Count
If Application.CountIf(RNG, RNG(Cell)) > 1 Then
ReDim Preserve toDel(i)
toDel(i) = RNG(Cell).Address
i = i + 1
End If
Next
'Iterate over the array and remove duplicates with specific color index (in this example - remove the yellow ones)
For i = UBound(toDel) To LBound(toDel) Step -1
If Range(toDel(i)).Cells.Interior.ColorIndex = 6 Then
Range(toDel(i)).Cells.Value = ""
End If
Next i
End Sub
我真的不知道 VBA,想知道一个有效的方法(spreadsheet 有大约 45,000 行要检查),调整代码以检查一系列列都与另一行匹配。
例如,我们要确保要将一行视为重复行,A 到 O 列中的所有单元格都必须与另一行中相应列中的值相匹配。
然后我们需要删除该行(如果它不是白色的)。我确定删除命令是:
Rows(RowToDel).EntireRow.Delete
我相信我会改变:
If Range(toDel(i)).Cells.Interior.ColorIndex = 6 Then
收件人:
If Range(toDel(i)).Cells.Interior.ColorIndex <> 0 Then
如果有人可以协助匹配多个列,我想我会很高兴。
尝试以下操作。对于支持图像中显示的数据,第 2 行将被删除,因为它是重复的,并且它的内部不像 -4142,即有填充。
代码进入标准模块。
您可以un-comment删除代码来执行删除
unionRng.EntireRow.Delete
并注释掉
Debug.Print unionRng.EntireRow.Address
当前显示的是将被删除的内容。
是的,它有点乱,可以使用一些 re-factoring,例如,您可以使用 Join 将 Evaluate(CONCATENATE...) 字符串缩短为更短的字符串。
代码:
Option Explicit
Public Sub sbFindDuplicatesInColumn_With_Color_Condition()
Dim RNG As Range
Dim wb As Workbook
Dim currentRow As Long
Dim targetSheet As Worksheet
Dim lastRow As Long
Set wb = ThisWorkbook
Set targetSheet = wb.Worksheets("Sheet1")
With targetSheet
lastRow = .Cells(.Rows.Count, 1).End(xlUp).Row 'find the last row in column A and use this to determine the number of rows in range to work with
Set RNG = .Range("A1:O" & lastRow) 'set your range here
Dim toDel()
toDel = RNG.Value2
ReDim Preserve toDel(1 To UBound(toDel, 1), 1 To UBound(toDel, 2) + 2)
Dim concatValuesDict As Scripting.Dictionary 'As Object
Set concatValuesDict = New Scripting.Dictionary ' = CreateObject("Scripting.Dictionary")
For currentRow = LBound(toDel, 1) To UBound(toDel, 1)
Dim joinedString As String
joinedString = Evaluate("CONCATENATE(""" & toDel(currentRow, 1) & """,""" & toDel(currentRow, 2) & toDel(currentRow, 3) & """,""" _
& toDel(currentRow, 4) & """,""" & toDel(currentRow, 5) & """,""" & toDel(currentRow, 6) & """,""" & _
toDel(currentRow, 7) & """,""" & toDel(currentRow, 8) & toDel(currentRow, 9) & """,""" & toDel(currentRow, 10) & """,""" & _
toDel(currentRow, 11) & """,""" & toDel(currentRow, 12) & """,""" & toDel(currentRow, 13) & """,""" & toDel(currentRow, 14) & """,""" & _
toDel(currentRow, 15) & """)") 'create a unique key for each row to determine duplicates by concatenating each column in the range
toDel(currentRow, UBound(toDel, 2) - 1) = joinedString
If Not concatValuesDict.Exists(joinedString) Then
concatValuesDict.Add joinedString, False 'add these "keys" to a dictionary, and if "key" not already present then associated dictionary value =False
Else
concatValuesDict(joinedString) = True 'key seen before so duplicate so set value to True
End If
Next currentRow
Dim unionRng As Range
For currentRow = LBound(toDel, 1) To UBound(toDel, 1)
toDel(currentRow, UBound(toDel, 2)) = concatValuesDict(toDel(currentRow, UBound(toDel, 2) - 1))
If toDel(currentRow, UBound(toDel, 2)) And targetSheet.Rows(currentRow).Interior.ColorIndex <> -4142 Then 'Some fill is applied and the dictionary value for this row is True (i.e. is a duplicate)
If Not unionRng Is Nothing Then
Set unionRng = Union(unionRng, targetSheet.Rows(currentRow).Cells(1, 1)) 'add the key to a range for later deletion using union
Else
Set unionRng = targetSheet.Rows(currentRow).Cells(1, 1)
End If
End If
Next currentRow
If Not unionRng Is Nothing Then
Debug.Print unionRng.EntireRow.Address
' unionRng.EntireRow.Delete
End If
End With
End Sub
数据图片:
示例运行: