更新大型数组中与辅助数组匹配的特定值(如 vlookup)

Update specific values within a large array where they match a secondary array (like a vlookup)

我的任务是在 Excel 中编写一个 "simple" 宏来执行以下任务,我可以相对轻松地使用 VLOOKUP 完成这些任务,但由于数据的大小,在某些情况下可能需要一天完全锁死电脑。

我有一个包含各种 headers 的 .csv,我需要对其进行查询以查找并替换那些 headers 以“_flag”结尾的列中的值。

"flags"(用于识别数据的置信度)已更新,所有旧数据都需要相应地刷新。

我有一个我称之为 "FlagMap" 的东西,它列出了旧标志以及相应的新标志应该作为单独的 table。

通常在 Excel 中,我会在另一个选项卡上有一个工作表(命名范围),其中包含 FlagMap 和 CSV,然后在要更改的列旁边手动放置一个虚拟列,运行一个 VLOOKUP - 已经完成了一些较小的数据集,但考虑到我必须做的事情,我会在完成之前退休(有些有 20 列 "flags"(总共 40 列)和多达 70,000 行)。

我已经制定了流程,但我真的被编码困住了。

整体循环 查找标题以“_flag”结尾的列 If True Loop通过vlookup样式代码来改变 基于 if cell in source data = a flag in FlagMap(COL1) replace it with FlagMap(COL2) value 结束变化循环 下一栏

其中一个关键是我可以替换原始源数据中的空白(标志)(通过在 FlagMap 数组中指定空白条目和相应的标志)。

我的代码完全是一团糟,因为我试图一点一点地构建它(通过记录宏等。正在考虑在列上使用过滤器)。

代码如下:虽然它似乎什么也没做; brain-hurts..... 子 FlagUpdate_v00()

Dim wsDATA As Worksheet 'original data to be updated
Dim wsFLAG As Worksheet 'Flag mapping lookup sheet
Dim rFLAGMAP As Range 'Flag mapping range n ROWs & 2 COLs(no headers)
Dim rDATA As Range 'Data to update
Dim i As Long, j As Long, n As Long 'Loop counters
Dim FlagLRow As Long, DataLRow As Long 'last row numbers of corresponding data tables
Dim FlagArray, DataArray, TempArray() As String  'lookup values

Set wsDATA = ThisWorkbook.Sheets("TEST") 'assigns location of data
Set wsFLAG = ThisWorkbook.Sheets("FlagMap") 'assigns location of flags

'lastrow = Cells.SpecialCells(xlCellTypeLastCell).Row
'FlagLRow = wsFLAG.Cells(Rows.Count, 1).End(x1Up).Row 'Sets number of rows upper bound of loop *NOT WORKING
'DataLRow = wsDATA.Cells(Rows.Count, 1).End(x1Up).Row 'Sets number of rows upper bound of loop *NOT WORKING
FlagLRow = wsFLAG.Cells.SpecialCells(xlCellTypeLastCell).Row 'Sets number of rows upper bound of loop
DataLRow = wsDATA.Cells.SpecialCells(xlCellTypeLastCell).Row 'Sets number of rows upper bound of loop

Set rFLAGMAP = wsFLAG.Range("A2:A" & FlagLRow) 'sets range of flags to avoiding column header

'this will need to be within a loop to only select range of those columns ending "_Flag"
'Set rDATA = wsDATA.Range(ColLoopRef & "2:A" & DataLRow) 'set range to update
Set rDATA = wsDATA.Range("F2:F" & DataLRow) 'test data only looking at one small column of data

FlagArray = rFLAGMAP.Value 'set contents of array
DataArray = rDATA.Value 'set contents of array

'Loop to replace
 For i = LBound(DataArray) To UBound(DataArray) 'start end values of i loop (Original Flag)
    For j = LBound(FlagArray) To UBound(FlagArray) 'start end values of j loop (FlagMap)
        If DataArray(i, 1) = FlagArray(j, 1) Then 'if Original Data Flag matches the value in the FlagMap
            Set DataArray(i, 1) = FlagArray(j, 2) 'replace it with that from column 2
        End If 'all flags should be Mapped hence always finds one
        'only issue may be blanks!!
    Next j 'loop through the MAPPEDFLAG list (ie. a vlookup) 1st
Next i 'move on to the next DATA flag to be re-flagged


End Sub

感谢@stucharo 突出显示了我的确切示例,如下所示:

CSV 格式的当前数据集

   h1        h2     h3_flag      h4     h5_flag      h6
-------------------------------------------------------------
   val1      val2      val3      val4      val5      val6
   val2      val3      val4      val5      val6      val1
   val3      val4      val5      val6      val1      val2
   val4      val5      val6      val1      val2      val3
   val5      val6      val1      val2      val3      val4
   val6      val1      val2      val3      val4      val5

标志映射table

 flag      alt.   
--------------------
val1      vala
val2      valb
val3      valc
val4      vald
val5      vale
val6      valf

输出

    h1        h2     h3_flag      h4     h5_flag      h6
-------------------------------------------------------------
   val1      val2      valc      val4      vale      val6
   val2      val3      vald      val5      valf      val1
   val3      val4      vale      val6      vala      val2
   val4      val5      valf      val1      valb      val3
   val5      val6      vala      val2      valc      val4
   val6      val1      valb      val3      vald      val5

这是我为类似的东西开发的一段代码,但我做了一些调整以适应您的问题。它可能需要一些整理,因为我无法访问您的文件来测试它:

Sub Flags()

'Assuming your flag map is in a 2 column range in a worksheet
'you can create an array then add it to a collection
'so you can access each entry in col2 by the key from col1
Dim fMap() As Variant
Dim FlagMap As New Collection

fMap = Range("A1:B6") 'range where FlagMap stored

Dim i As Integer        'We can reuse this in other counters....
For i = LBound(fMap) To UBound(fMap)
    FlagMap.Add CStr(fMap(i, 2)), CStr(fMap(i, 1)) 'FlagMap range has "Key" field 1st - assign to Collection correctly
Next i

'Create a FileSystemObject to work with your csv's
Dim fso As Object
Set fso = CreateObject("Scripting.FileSystemObject")

'Create a file object to contain your original csv
Dim ifil As Object
Set ifil = fso.GetFile("C:\completepath\myinfile.csv")

'Create a textstream from the input csv file
Dim its As Object
Set its = ifil.OpenAsTextStream()

'We can also output this back into a new csv by
'creating  a textstream to let us write to a new file
Dim ots As Object
Set ots = fso.CreateTextFile("C:\completepath\myoutfile.csv")

'We'll also need a string variable to send to this stream
Dim oStr As String

'Create a counter to keep track of lines through your
'input textstream
Dim lineCounter As Integer
lineCounter = 0

'You can split each line into an array delimited by "," using
'the "Split" function so we need an array variable to hold this
Dim lineArray() As String

'Because this steps through row wise, we also need an
'collection to hold references to the columns of interest
'and a variant to access them again
Dim cols As New Collection
Dim col as Variant

'Continue a loop until you reach the end of your textstream
'i.e. the end of your input csv file
Do While Not its.AtEndOfStream

    'Increment your line counter
    lineCounter = lineCounter + 1
    lineArray = Split(its.ReadLine, ",")

    'If you are on the first row, find all the interesting columns
    If lineCounter = 1 Then ' assuming the headings are on row 1
        For i = LBound(lineArray) To UBound(lineArray)
            'If we find a heading ending in "_flag" then store its
            'column number in the cols collection.
            If Right(lineArray(i), 5) = "_flag" Then
                cols.Add i 'Add ref of where heading "_flag" is to limit lookup just to those areas and not the whole dataset
            End If
        Next i

    'If it's not the first row then process the data as normal
    Else
        'Just look at the columns of interest
        For Each col In cols
            'Replace the value in that column with the corresponding
            'value in the FlagMap
            lineArray(CInt(col)) = FlagMap(lineArray(CInt(col))) 'this is the lookup in the collection based on the "key"
        Next col
    End If

    'Now print the line back out to your new csv file
    oStr = ""
    For i = LBound(lineArray) To UBound(lineArray)
        oStr = oStr + lineArray(i) & ","
    Next i
    oStr = Left(oStr, Len(oStr) - 1)

    ots.WriteLine (oStr)

Loop

'Close the textstreams
its.Close
ots.Close

End Sub

TextStream will let you read and write .csv files in VBA without opening them in excel. Collections 允许您发展 'key':{value} 关系,这听起来有点像您的 FlagMap。