vba 用于比较和复制粘贴超过 10,000 个值的高效循环

vba efficient loop to compare and do copy and paste for more than 10,000 values

我想比较 2 个时间值,如果它们匹配,则粘贴当时的温度值,如果在特定时间点缺少一个测量值,则分配 0。此代码目前适用于1000 个值(不到 1 分钟),但是 10,000 个值需要一个多小时。如何减少循环时间?

Sub findMatching()
Dim CurrentLine As Integer, CurrentLine2 As Integer, CurrentLine3 As Integer
Dim MaxRows As Integer, MaxRows2 As Integer

MaxRows = 1000
MaxRows2 = 1000
CurrentLine = 1
For CurrentLine = 1 To MaxRows '-- Loop in A column (read data)
    For CurrentLine2 = 1 To MaxRows2 '-- Loop in D column (compare data)
      If Sheets(1).Cells(CurrentLine, 1) = Sheets(1).Cells(CurrentLine2,4) Then
      '-- copying matching data
    Sheets(1).Cells(CurrentLine, 2) = Sheets(1).Cells(CurrentLine2, 5)
    CurrentLine = CurrentLine + 1
    ElseIf Sheets(1).Cells(CurrentLine, 1) <> Sheets(1).Cells(CurrentLine2,4) Then
      Sheets(1).Cells(CurrentLine, 2) = 0
    End If
   Next CurrentLine2
 Next CurrentLine
End Sub

下面的代码依赖于您能够访问 Scripting.Dictionary 对象。我使用后期绑定,因此您不需要添加引用。

你说 Range.Resize 害死你了。不太清楚为什么会这样,但我在下面的代码中再次使用了它。如果您有性能问题,请告诉我。

Option Explicit

Private Sub findFirstMatching()

    ' Declared two constants because OP had done it that way in their post.
    ' Depending on use case, could get rid of second and just use the one
    ' But having two allows you to change one without the other.
    Const READ_ROW_COUNT As Long = 10000 ' Used for columns A, B
    Const COMPARISON_ROW_COUNT As Long = 10000 ' Used for columns D, E

    ' Change sheet name below to wherever the data is. I assume Sheet1 '
    With ThisWorkbook.Worksheets("Sheet1")

        Dim columnA() As Variant
        columnA = .Range("A1").Resize(READ_ROW_COUNT, 1).Value2

        Dim columnD() As Variant
        columnD = .Range("D1").Resize(COMPARISON_ROW_COUNT, 1).Value2

        Dim columnE() As Variant
        columnE = .Range("E1").Resize(COMPARISON_ROW_COUNT, 1).Value2

        Dim dict As Object
        Set dict = CreateObject("Scripting.Dictionary")

        Dim rowIndex As Long

        ' Fill dictionary (array values as dictionary's keys, row numbers as dictionary's corresponding values)
        ' If there are duplicates in column D, the dictionary will only contain/return the row number of the FIRST instance/match
        For rowIndex = LBound(columnD, 1) To UBound(columnD, 1)
            If Not dict.Exists(columnD(rowIndex, 1)) Then
                dict.Add columnD(rowIndex, 1), rowIndex
            End If
        Next rowIndex

        Dim outputArray() As Variant
        ReDim outputArray(1 To READ_ROW_COUNT, 1 To 1)
        Dim rowIndexOfFirstMatch As Long

        ' Now loop through column A's values and check if it exists in dict
        For rowIndex = LBound(columnA, 1) To UBound(columnA, 1)
            If dict.Exists(columnA(rowIndex, 1)) Then
                rowIndexOfFirstMatch = dict.Item(columnA(rowIndex, 1))
                outputArray(rowIndex, 1) = columnE(rowIndexOfFirstMatch, 1)
            Else
                outputArray(rowIndex, 1) = "#N/A" ' Change to zero if needed.
            End If
        Next rowIndex

        .Range("B1").Resize(READ_ROW_COUNT, 1) = outputArray

    End With
End Sub

我在我这边生成的一些虚拟数据上测试了代码,在我看来,代码应该按照你所描述的执行(对于 A 列中的每个值,我的输出中的 B 列包含 #N/A 或 E 列中的值(如果找到匹配项)。如果还是不行,请告诉我 why/what 错了。

如果将 Excel 用于 PC,请考虑 SQL,因为 Office 应用程序可以连接到 JET/ACE SQL 引擎(Windows .DLL 文件)。本质上,您需要跨列进行条件计算,可以使用 IIF(对应于 ANSI SQL 的 CASE)处理。对于这种基于集合的操作,10,000 条记录非常快 运行。此解决方案不需要循环。

以下假设:

  1. 你 运行 Excel 2007+ 在 PC 上安装了 ODBC/OLEDB 驱动程序。
  2. 数据从 A1 开始,具有命名列。可以根据需要更改范围和字段。调整 SQL 中的列和 sheet 名称,保留括号 []$)
  3. 存在名为 "RESULTS" 的空 sheet。

SQL (嵌入VBA)

SELECT t.*, IIF(t.[TimeValue1] = t.[TimeValue2], t.[TemperatureValue], 0) As NewColumn
FROM [SheetName$] t

VBA

Sub RunSQL()
On Error GoTo ErrHandle
    Dim conn As Object, rst As Object
    Dim strConnection As String, strSQL As String
    Dim i As Integer, fld As Object

    Set conn = CreateObject("ADODB.Connection")
    Set rst = CreateObject("ADODB.Recordset")

    ' ODBC AND OLEDB CONNECTIONS
    '    strConnection = "DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" _
    '                      & "DBQ=" & ThisWorkbook.FullName & ";"
    strConnection = "Provider=Microsoft.ACE.OLEDB.12.0;" _
                       & "Data Source='" & ThisWorkbook.FullName & "';" _
                       & "Extended Properties=""Excel 12.0 Xml;HDR=YES;"";"

    strSQL = "SELECT t.*, IIF(t.timeValue1 = t.timeValue2, t.Temperaturevalue, 0) As NewColumn" _
             & " FROM [SheetName$] t;"

    ' OPEN CONNECTION
    conn.Open strConnection
    rst.Open strSQL, conn

    With ThisWorkbook.Worksheets("RESULTS")
       ' COLUMNS
       For i = 1 To rst.Fields.Count
          .Cells(1, i) = rst.Fields(i - 1).Name
       Next i 

       ' DATA
      .Range("A2").CopyFromRecordset rst
    End With

    rst.Close: conn.Close
    MsgBox "Successfully ran SQL query!", vbInformation

ExitHandle:
    Set rst = Nothing: Set conn = Nothing
    Exit Sub

ErrHandle:
    MsgBox Err.Number & " - " & Err.Description, vbCritical, "RUNTIME ERROR"
    Resume ExitHandle
End Sub