Excel 数据验证协助

Excel Data Verification Assistance

我面临着一个有趣的数据验证挑战,每个月我的经理都会对多达 2 万行数据(这完全是手动的)执行数据验证 activity,她会在其中查找重复值列 (EventID),如果存在重复值,则比较另一列 (subType) 的值(如果值不相同)。这两行都被复制到不同的 sheet 以进行辅助处理。

我仍在学习 VBA,所以我认为这是一个成长过程 activity,因为这会让我的经理在她的日子里有很多时间。

我有目前正在处理的代码,但我发现我开始走错路了,有人能帮忙吗?

Sub Find_changes ()

Dim eventID As Range
Dim subtype As Range
Dim cell As Range
Dim LastRow As Long
Dim Reader As Worksheet
Dim Writer As Worksheet

Set Reader = ThisWorkbook.Worksheets(2)
Set Writer = ThisWorkbook.Worksheets(3)
Set eventID = Reader.Range("b:b")
Set subtype = Reader.Range("j:j")

Let LastRow = Writer.Cells(Rows.Count, 1).End(xlUp).Row + 1

For Each cell In eventID
    If eventID = eventID And subtype <> subtype Then
        cell.EntireRow.Copy Destination:=Writer.Range(LastRow)
                
    End If
 Next
    
End Sub

我还在下面包含了一个模拟数据集:

Mock Data Set

理想情况下我想要实现的(以图片为例)是:

对于大量单元格,将范围存储为数组并在 VBA 内存中使用它们通常比直接在 Excel 中使用它们更有好处。您会发现在 VBA 内存中处理数据比直接处理 Excel 快得多。

下面是一段代码(非常需要您进行修改才能使用),它将引导您朝着正确的方向前进。我在代码片段下方添加了注释来解释方法,并指出您还需要做些什么才能使这项工作为您所用。

Option Explicit

Sub Find_changes_modified()

    ' Reference the sheet with the data
    Dim Reader As Worksheet
    Set Reader = ThisWorkbook.Worksheets(2)

    ' Store the entire dataset as a range
    Dim RangeReader As Range
    Set RangeReader = Reader.Range("A1:J6") ' ***
    
    ' Sort the range from lowest to highest EventID, and lowest to highest Report Subtype
    ' Sorting the range allows us to compare EventIDs that are next to one another
    With Reader.Sort
        .SortFields.Clear
        .SortFields.Add2 Key:=Range("B2:B6"), Order:=xlAscending ' ***
        .SortFields.Add2 Key:=Range("J2:J6"), Order:=xlAscending ' ***
        .SetRange RangeReader
        .Header = xlYes
        .Apply
    End With

    ' Store the entire range as an array
    Dim ArrayReader() As Variant
    ArrayReader = RangeReader.Value
    
    ' Column numbers of columns in the array
    Dim ColumnNumberEventID As Long, ColumnNumberSubtype As Long
    ColumnNumberEventID = 2 ' ***
    ColumnNumberSubtype = 10 ' ***
    
    ' Store all duplicates in another array
    ' Make ArrayWriter the same size as ArrayReader
    Dim ArrayWriter() As Variant
    ReDim ArrayWriter(1 To UBound(ArrayReader, 1), 1 To UBound(ArrayReader, 2))
    
    Dim Duplicates As Long
    Duplicates = 0
    
    ' Iterate through the array and check for duplicates in the EventID column
    Dim ii As Long, jj As Long
    Dim ThisEventID As String, NextEventID As String, ThisSubType As String, NextSubType As String
    For ii = LBound(ArrayReader, 1) To UBound(ArrayReader, 1) - 1
    
        ThisEventID = ArrayReader(ii, ColumnNumberEventID)
        NextEventID = ArrayReader(ii + 1, ColumnNumberEventID)
    
        If ThisEventID = NextEventID Then
        
            ThisSubType = ArrayReader(ii, ColumnNumberSubtype)
            NextSubType = ArrayReader(ii + 1, ColumnNumberSubtype)
        
            If ThisSubType <> NextSubType Then
            
                Duplicates = Duplicates + 1
            
                ' Copy all of the row's information to the ArrayWriter
                For jj = LBound(ArrayReader, 2) To UBound(ArrayReader, 2)
                
                    ArrayWriter(Duplicates, jj) = ArrayReader(ii, jj)
                
                Next jj
            
            End If
        
        End If
    
    Next ii
    
    ' Reference the sheet to write the duplicate data
    Dim Writer As Worksheet
    Set Writer = ThisWorkbook.Worksheets(3)
    
    ' Write the contents of the ArrayWriter to the other sheet
    Writer.Range("A1:J1").Value = ArrayWriter
    
End Sub

大图是根据 2 列(事件 ID 和报告子类型)对数据进行排序,然后将数据 row-by-row 与其相邻数据进行比较。这种排序意味着我们只需要将每一行数据与其相邻数据进行比较,而不是每次都将多行与多行进行比较。

此代码中所有带有注释的地方*** 表示实际使用时需要更改的内容。我使用了许多硬编码值来说明它是如何工作的。

我们首先使用数据 Reader 创建对 sheet 的引用,就像您最初所做的那样。接下来,我们将数据存储在一个范围内,RangeReader。使用此范围,我们根据事件 ID 列对数据进行排序,然后根据报告子类型列对数据进行排序。

范围排序后,我们将其存储为数组 ArrayReader,因此我们可以在 VBA 内存中使用它。我们还需要一个数组来存储我们遇到的任何重复数据行,ArrayWriter。然后,遍历 ArrayReader 的内容并将每一行与其相邻行进行比较。如果一行及其邻居满足条件,则将其添加到 ArrayWriter。完成所有这些后,将 ArrayWriter 信息写入 sheet.

您需要考虑的一些事项:

  • 你能找到一种编程方式来确定 RangeReader 的单元格,而不是仅仅将它们手动输入代码吗?
  • 你能找到一种编程方法来确定用于排序的每个范围吗RangeReader
  • ColumnNumberEventIDColumnNumberSubType 的值总是分别为 2 和 10 吗?如果不是,您如何确保您的代码始终引用正确的列?
  • 当此代码找到重复项时,它只存储其中一项。您可能还想存储重复的其他项目。

请尝试下一种方法,对于更大的范围应该足够快。它使用脚本 Dictionary 来查找重复项,并根据创建的范围(使用 SpecialCells(xlCellTypeConstants))在使用范围之外的列中放置一个标记,该范围将在最后立即复制:

Sub Find_changes()
 Dim Reader As Worksheet, LastRow As Long, Writer As Worksheet, rngCopy As Range
 Dim arrID, arrType, arrMark, i As Long, maxCol As Long, maxRow As Long, dict As Object

 Set Reader = ThisWorkbook.Worksheets(2)
 Set Writer = ThisWorkbook.Worksheets(3)
 LastRow = Writer.Range("B" & Writer.rows.count).End(xlUp).Row + 1

 maxRow = Reader.Range("B" & Reader.rows.count).End(xlUp).Row
 maxCol = Reader.UsedRange.Columns.count + 20

 arrID = Reader.Range("B1:B" & maxRow).value    'plase the range in an array for faster iteration
 arrType = Reader.Range("J1:J" & maxRow).value 'plase the range in an array for faster iteration
 ReDim arrMark(1 To maxRow, 1 To 1)

 Set dict = CreateObject("Scripting.Dictionary")
 'Place the duplicates in a dictionary and mark their rows:
 For i = 1 To maxRow
    If Not dict.Exists(arrID(i, 1)) Then
        dict.Add arrID(i, 1), Array(arrType(i, 1), i)
    Else
        If dict(arrID(i, 1))(0) <> arrType(i, 1) Then
            arrMark(i, 1) = "Copy"                         'write "Copy" in arrMark
            arrMark(dict(arrID(i, 1))(1), 1) = "Copy" 'write "Copy" in arrMark
        End If
    End If
 Next

 'drop the marker array after the last column:
 Reader.cells(1, maxCol).Resize(UBound(arrMark), 1).value = arrMark

 'Extract the range to be copied (at once)
 On Error Resume Next
  Set rngCopy = Reader.Range(Reader.cells(1, maxCol), Reader.cells(maxRow, maxCol)).SpecialCells(xlCellTypeConstants)
 On Error GoTo 0
 If Not rngCopy Is Nothing Then
    rngCopy.ClearContents
    rngCopy.EntireRow.Copy Writer.Range("A" & LastRow)
 End If
 MsgBox "Ready..."
End Sub

我试着评论代码行,但如果有什么地方不够清楚,请不要犹豫,要求澄清。