Excel VBA - 搜索所有文件夹(向下钻取)的递归文件将结果写入同一数组的效果不如集合

Excel VBA - Recursive files searching all Folders (drill down) writing results to same array doesn't work so well as a collection

这是我的第一个 post - 我希望它是一个好的 :) 对于家庭来说,这是一个小任务,我想要一个文件夹(及其子文件夹)中所有文件路径的数组,但仅适用于 PDF 或我告诉它要过滤的文件类型。

我更喜欢数组(它可以非常快速地写入范围),我知道我可以将我的第一个代码示例从集合转换为数组,但我想学习和理解的逻辑/语法如何实现示例 1 但仅使用数组。

示例 1 有效(我省略了用于 Debug.Print 的其他代码):

Sub GetAllFilePaths(StartFolder As String, Pattern As String, _
             ByRef colFiles As Collection)

    Dim f As String, sf As String, subF As New Collection, S

    If Right(StartFolder, 1) <> "\" Then StartFolder = StartFolder & "\"

    f = Dir(StartFolder & Pattern)
    Do While Len(f) > 0
        colFiles.Add StartFolder & f
        f = Dir()
    Loop

    sf = Dir(StartFolder, vbDirectory)
    Do While Len(sf) > 0
        If sf <> "." And sf <> ".." Then
            If (GetAttr(StartFolder & sf) And vbDirectory) <> 0 Then
                    subF.Add StartFolder & sf
            End If
        End If
        sf = Dir()
    Loop

    For Each S In subF
        GetAllFilePaths CStr(S), Pattern, colFiles
    Next S

End Sub

示例 2 不太有效,它似乎以我想要的方式循环但覆盖了数组而不是添加到它,所以没有得到我知道的所有 PDF 文件都在深层子文件夹中。

我认为这是我处理添加到数组的方式,调整大小以及我在哪个索引处添加新值,我已经看过..到处寻求帮助,甚至在这里 Recursive search of file/folder structure, https://excelvirtuoso.net/2017/02/07/multi-dimensional-arrays/, VBA macro that search for file in multiple subfolders,

我知道这里的逻辑不对,但似乎无法弄明白,请帮忙..

例2代码(我把调用方式写进去,用Debug.Print测试一下):

Option Explicit
Sub GetAllFilePaths(StartFolder As String, Pattern As String, ByRef allFilePaths As Variant, ByRef allFileNames As Variant)
    Dim FNum As Integer
    Dim mainFolder As Object
    Dim pathFile As String
    Dim subFoldersRecurs As New Collection, SubPath
    Dim SubFilePath As String

    If Right(StartFolder, 1) <> "\" Then StartFolder = StartFolder & "\"

    pathFile = Dir(StartFolder & Pattern)
    Do While Len(pathFile) > 0
        FNum = FNum + 1
        ReDim Preserve allFileNames(1 To FNum)
        ReDim Preserve allFilePaths(1 To FNum)
        allFileNames(FNum) = pathFile
        allFilePaths(FNum) = StartFolder & pathFile
        pathFile = Dir()
    Loop


    SubFilePath = Dir(StartFolder, vbDirectory)
    Do While Len(SubFilePath) > 0
        If SubFilePath <> "." And SubFilePath <> ".." Then
            If (GetAttr(StartFolder & SubFilePath) And vbDirectory) <> 0 Then
                subFoldersRecurs.Add StartFolder & SubFilePath
            End If
        End If
        SubFilePath = Dir()
    Loop

    For Each SubPath In subFoldersRecurs
        GetAllFilePaths CStr(SubPath), Pattern, allFilePaths, allFileNames
    Next SubPath

End Sub

Sub PDFfilestoCollall()
Dim pdfFilePaths() As Variant
Dim pdfFileNames() As Variant

Call GetAllFilePaths("C:\Users\adg\Downloads\test folder of files for ingest\", "*.PDF", pdfFilePaths, pdfFileNames)

Dim CollEntry As Variant
For Each CollEntry In pdfFilePaths
    Debug.Print CollEntry

谢谢, 平均总产量

我在这里重构了你的代码。

Sub GetAllFilePaths(ByVal StartFolder As String, ByVal Pattern As String, _
    ByRef arrFiles() As String, Optional ByRef AddToArrayAt As Long = -1)

    Dim f As String
    Dim sf As String
    Dim subF As Collection
    Dim S
    Dim AddedFiles As Boolean

    If Right(StartFolder, 1) <> "\" Then StartFolder = StartFolder & "\"
    If AddToArrayAt < 0 Then AddToArrayAt = LBound(arrFiles)

    f = Dir(StartFolder & Pattern)
    Do While Len(f) > 0
        AddedFiles = True
        If AddToArrayAt > UBound(arrFiles) Then ReDim Preserve arrFiles(LBound(arrFiles) To UBound(arrFiles) + 100)
        arrFiles(AddToArrayAt) = StartFolder & f
        AddToArrayAt = AddToArrayAt + 1
        f = Dir()
    Loop
    If AddedFiles Then ReDim Preserve arrFiles(LBound(arrFiles) To AddToArrayAt - 1)

    Set subF = New Collection
    sf = Dir(StartFolder, vbDirectory)
    Do While Len(sf) > 0
        If sf <> "." And sf <> ".." Then
            If (GetAttr(StartFolder & sf) And vbDirectory) <> 0 Then
                subF.Add StartFolder & sf
            End If
        End If
        sf = Dir()
    Loop

    For Each S In subF
        GetAllFilePaths CStr(S), Pattern, arrFiles, AddToArrayAt
    Next S

End Sub

Sub test()
    Dim pdfFileNames() As String

    ReDim pdfFileNames(1 To 100)
    GetAllFilePaths "C:\Data\", "*.PDF", pdfFileNames

    Dim i As Long
    For i = LBound(pdfFileNames) To UBound(pdfFileNames)
        Debug.Print pdfFileNames(i)
    Next

End Sub

两点注意事项:

  • 我正在 Redim Preserve 以 100 个为单位处理 arrFiles 数组,因为这个操作非常慢
  • 我在内部为文件夹循环保留了一个集合,因为它非常方便并且不会暴露给调用例程
  • 我没有研究过你的Dir,所以我对它们的功效或效率不做任何声明