vba 用于比较和复制粘贴超过 10,000 个值的高效循环
vba efficient loop to compare and do copy and paste for more than 10,000 values
我想比较 2 个时间值,如果它们匹配,则粘贴当时的温度值,如果在特定时间点缺少一个测量值,则分配 0。此代码目前适用于1000 个值(不到 1 分钟),但是 10,000 个值需要一个多小时。如何减少循环时间?
Sub findMatching()
Dim CurrentLine As Integer, CurrentLine2 As Integer, CurrentLine3 As Integer
Dim MaxRows As Integer, MaxRows2 As Integer
MaxRows = 1000
MaxRows2 = 1000
CurrentLine = 1
For CurrentLine = 1 To MaxRows '-- Loop in A column (read data)
For CurrentLine2 = 1 To MaxRows2 '-- Loop in D column (compare data)
If Sheets(1).Cells(CurrentLine, 1) = Sheets(1).Cells(CurrentLine2,4) Then
'-- copying matching data
Sheets(1).Cells(CurrentLine, 2) = Sheets(1).Cells(CurrentLine2, 5)
CurrentLine = CurrentLine + 1
ElseIf Sheets(1).Cells(CurrentLine, 1) <> Sheets(1).Cells(CurrentLine2,4) Then
Sheets(1).Cells(CurrentLine, 2) = 0
End If
Next CurrentLine2
Next CurrentLine
End Sub
下面的代码依赖于您能够访问 Scripting.Dictionary 对象。我使用后期绑定,因此您不需要添加引用。
你说 Range.Resize
害死你了。不太清楚为什么会这样,但我在下面的代码中再次使用了它。如果您有性能问题,请告诉我。
Option Explicit
Private Sub findFirstMatching()
' Declared two constants because OP had done it that way in their post.
' Depending on use case, could get rid of second and just use the one
' But having two allows you to change one without the other.
Const READ_ROW_COUNT As Long = 10000 ' Used for columns A, B
Const COMPARISON_ROW_COUNT As Long = 10000 ' Used for columns D, E
' Change sheet name below to wherever the data is. I assume Sheet1 '
With ThisWorkbook.Worksheets("Sheet1")
Dim columnA() As Variant
columnA = .Range("A1").Resize(READ_ROW_COUNT, 1).Value2
Dim columnD() As Variant
columnD = .Range("D1").Resize(COMPARISON_ROW_COUNT, 1).Value2
Dim columnE() As Variant
columnE = .Range("E1").Resize(COMPARISON_ROW_COUNT, 1).Value2
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
Dim rowIndex As Long
' Fill dictionary (array values as dictionary's keys, row numbers as dictionary's corresponding values)
' If there are duplicates in column D, the dictionary will only contain/return the row number of the FIRST instance/match
For rowIndex = LBound(columnD, 1) To UBound(columnD, 1)
If Not dict.Exists(columnD(rowIndex, 1)) Then
dict.Add columnD(rowIndex, 1), rowIndex
End If
Next rowIndex
Dim outputArray() As Variant
ReDim outputArray(1 To READ_ROW_COUNT, 1 To 1)
Dim rowIndexOfFirstMatch As Long
' Now loop through column A's values and check if it exists in dict
For rowIndex = LBound(columnA, 1) To UBound(columnA, 1)
If dict.Exists(columnA(rowIndex, 1)) Then
rowIndexOfFirstMatch = dict.Item(columnA(rowIndex, 1))
outputArray(rowIndex, 1) = columnE(rowIndexOfFirstMatch, 1)
Else
outputArray(rowIndex, 1) = "#N/A" ' Change to zero if needed.
End If
Next rowIndex
.Range("B1").Resize(READ_ROW_COUNT, 1) = outputArray
End With
End Sub
我在我这边生成的一些虚拟数据上测试了代码,在我看来,代码应该按照你所描述的执行(对于 A 列中的每个值,我的输出中的 B 列包含 #N/A
或 E 列中的值(如果找到匹配项)。如果还是不行,请告诉我 why/what 错了。
如果将 Excel 用于 PC,请考虑 SQL,因为 Office 应用程序可以连接到 JET/ACE SQL 引擎(Windows .DLL 文件)。本质上,您需要跨列进行条件计算,可以使用 IIF
(对应于 ANSI SQL 的 CASE
)处理。对于这种基于集合的操作,10,000 条记录非常快 运行。此解决方案不需要循环。
以下假设:
- 你 运行 Excel 2007+ 在 PC 上安装了 ODBC/OLEDB 驱动程序。
- 数据从
A1
开始,具有命名列。可以根据需要更改范围和字段。调整 SQL 中的列和 sheet 名称,保留括号 []
和 $
)
- 存在名为 "RESULTS" 的空 sheet。
SQL (嵌入VBA)
SELECT t.*, IIF(t.[TimeValue1] = t.[TimeValue2], t.[TemperatureValue], 0) As NewColumn
FROM [SheetName$] t
VBA
Sub RunSQL()
On Error GoTo ErrHandle
Dim conn As Object, rst As Object
Dim strConnection As String, strSQL As String
Dim i As Integer, fld As Object
Set conn = CreateObject("ADODB.Connection")
Set rst = CreateObject("ADODB.Recordset")
' ODBC AND OLEDB CONNECTIONS
' strConnection = "DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" _
' & "DBQ=" & ThisWorkbook.FullName & ";"
strConnection = "Provider=Microsoft.ACE.OLEDB.12.0;" _
& "Data Source='" & ThisWorkbook.FullName & "';" _
& "Extended Properties=""Excel 12.0 Xml;HDR=YES;"";"
strSQL = "SELECT t.*, IIF(t.timeValue1 = t.timeValue2, t.Temperaturevalue, 0) As NewColumn" _
& " FROM [SheetName$] t;"
' OPEN CONNECTION
conn.Open strConnection
rst.Open strSQL, conn
With ThisWorkbook.Worksheets("RESULTS")
' COLUMNS
For i = 1 To rst.Fields.Count
.Cells(1, i) = rst.Fields(i - 1).Name
Next i
' DATA
.Range("A2").CopyFromRecordset rst
End With
rst.Close: conn.Close
MsgBox "Successfully ran SQL query!", vbInformation
ExitHandle:
Set rst = Nothing: Set conn = Nothing
Exit Sub
ErrHandle:
MsgBox Err.Number & " - " & Err.Description, vbCritical, "RUNTIME ERROR"
Resume ExitHandle
End Sub
我想比较 2 个时间值,如果它们匹配,则粘贴当时的温度值,如果在特定时间点缺少一个测量值,则分配 0。此代码目前适用于1000 个值(不到 1 分钟),但是 10,000 个值需要一个多小时。如何减少循环时间?
Sub findMatching()
Dim CurrentLine As Integer, CurrentLine2 As Integer, CurrentLine3 As Integer
Dim MaxRows As Integer, MaxRows2 As Integer
MaxRows = 1000
MaxRows2 = 1000
CurrentLine = 1
For CurrentLine = 1 To MaxRows '-- Loop in A column (read data)
For CurrentLine2 = 1 To MaxRows2 '-- Loop in D column (compare data)
If Sheets(1).Cells(CurrentLine, 1) = Sheets(1).Cells(CurrentLine2,4) Then
'-- copying matching data
Sheets(1).Cells(CurrentLine, 2) = Sheets(1).Cells(CurrentLine2, 5)
CurrentLine = CurrentLine + 1
ElseIf Sheets(1).Cells(CurrentLine, 1) <> Sheets(1).Cells(CurrentLine2,4) Then
Sheets(1).Cells(CurrentLine, 2) = 0
End If
Next CurrentLine2
Next CurrentLine
End Sub
下面的代码依赖于您能够访问 Scripting.Dictionary 对象。我使用后期绑定,因此您不需要添加引用。
你说 Range.Resize
害死你了。不太清楚为什么会这样,但我在下面的代码中再次使用了它。如果您有性能问题,请告诉我。
Option Explicit
Private Sub findFirstMatching()
' Declared two constants because OP had done it that way in their post.
' Depending on use case, could get rid of second and just use the one
' But having two allows you to change one without the other.
Const READ_ROW_COUNT As Long = 10000 ' Used for columns A, B
Const COMPARISON_ROW_COUNT As Long = 10000 ' Used for columns D, E
' Change sheet name below to wherever the data is. I assume Sheet1 '
With ThisWorkbook.Worksheets("Sheet1")
Dim columnA() As Variant
columnA = .Range("A1").Resize(READ_ROW_COUNT, 1).Value2
Dim columnD() As Variant
columnD = .Range("D1").Resize(COMPARISON_ROW_COUNT, 1).Value2
Dim columnE() As Variant
columnE = .Range("E1").Resize(COMPARISON_ROW_COUNT, 1).Value2
Dim dict As Object
Set dict = CreateObject("Scripting.Dictionary")
Dim rowIndex As Long
' Fill dictionary (array values as dictionary's keys, row numbers as dictionary's corresponding values)
' If there are duplicates in column D, the dictionary will only contain/return the row number of the FIRST instance/match
For rowIndex = LBound(columnD, 1) To UBound(columnD, 1)
If Not dict.Exists(columnD(rowIndex, 1)) Then
dict.Add columnD(rowIndex, 1), rowIndex
End If
Next rowIndex
Dim outputArray() As Variant
ReDim outputArray(1 To READ_ROW_COUNT, 1 To 1)
Dim rowIndexOfFirstMatch As Long
' Now loop through column A's values and check if it exists in dict
For rowIndex = LBound(columnA, 1) To UBound(columnA, 1)
If dict.Exists(columnA(rowIndex, 1)) Then
rowIndexOfFirstMatch = dict.Item(columnA(rowIndex, 1))
outputArray(rowIndex, 1) = columnE(rowIndexOfFirstMatch, 1)
Else
outputArray(rowIndex, 1) = "#N/A" ' Change to zero if needed.
End If
Next rowIndex
.Range("B1").Resize(READ_ROW_COUNT, 1) = outputArray
End With
End Sub
我在我这边生成的一些虚拟数据上测试了代码,在我看来,代码应该按照你所描述的执行(对于 A 列中的每个值,我的输出中的 B 列包含 #N/A
或 E 列中的值(如果找到匹配项)。如果还是不行,请告诉我 why/what 错了。
如果将 Excel 用于 PC,请考虑 SQL,因为 Office 应用程序可以连接到 JET/ACE SQL 引擎(Windows .DLL 文件)。本质上,您需要跨列进行条件计算,可以使用 IIF
(对应于 ANSI SQL 的 CASE
)处理。对于这种基于集合的操作,10,000 条记录非常快 运行。此解决方案不需要循环。
以下假设:
- 你 运行 Excel 2007+ 在 PC 上安装了 ODBC/OLEDB 驱动程序。
- 数据从
A1
开始,具有命名列。可以根据需要更改范围和字段。调整 SQL 中的列和 sheet 名称,保留括号[]
和$
) - 存在名为 "RESULTS" 的空 sheet。
SQL (嵌入VBA)
SELECT t.*, IIF(t.[TimeValue1] = t.[TimeValue2], t.[TemperatureValue], 0) As NewColumn
FROM [SheetName$] t
VBA
Sub RunSQL()
On Error GoTo ErrHandle
Dim conn As Object, rst As Object
Dim strConnection As String, strSQL As String
Dim i As Integer, fld As Object
Set conn = CreateObject("ADODB.Connection")
Set rst = CreateObject("ADODB.Recordset")
' ODBC AND OLEDB CONNECTIONS
' strConnection = "DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" _
' & "DBQ=" & ThisWorkbook.FullName & ";"
strConnection = "Provider=Microsoft.ACE.OLEDB.12.0;" _
& "Data Source='" & ThisWorkbook.FullName & "';" _
& "Extended Properties=""Excel 12.0 Xml;HDR=YES;"";"
strSQL = "SELECT t.*, IIF(t.timeValue1 = t.timeValue2, t.Temperaturevalue, 0) As NewColumn" _
& " FROM [SheetName$] t;"
' OPEN CONNECTION
conn.Open strConnection
rst.Open strSQL, conn
With ThisWorkbook.Worksheets("RESULTS")
' COLUMNS
For i = 1 To rst.Fields.Count
.Cells(1, i) = rst.Fields(i - 1).Name
Next i
' DATA
.Range("A2").CopyFromRecordset rst
End With
rst.Close: conn.Close
MsgBox "Successfully ran SQL query!", vbInformation
ExitHandle:
Set rst = Nothing: Set conn = Nothing
Exit Sub
ErrHandle:
MsgBox Err.Number & " - " & Err.Description, vbCritical, "RUNTIME ERROR"
Resume ExitHandle
End Sub