Fastest way to find the full path of a given file via PowerShell?

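I need to write a PowerShell snippet that finds the full path(s) of a given file name across a complete partition as fast as possible.

For the sake of better comparison, I am using these global variables in my code samples: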

$searchDir  = "c:\"
$searchName = "hosts"

I started with a small snippet using Get-ChildItem to get a first baseline:

"get-ChildItem"
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$result = Get-ChildItem -LiteralPath $searchDir -Filter $searchName -File -Recurse -ea 0
write-host $timer.Elapsed.TotalSeconds "sec."

The runtime on my SSD was 14,8581609 sec.

Next, I tried running the classic DIR command to see the improvement:

"dir"
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$result = &cmd /c dir "$searchDir$searchName" /b /s /a-d
$timer.Stop()
write-host $timer.Elapsed.TotalSeconds "sec."

This finished in 13,4713342 sec. Not bad, but can we get it faster?

In the third iteration, I tested the same task with ROBOCOPY. Here is the code sample:

"robocopy"
$timer = [System.Diagnostics.Stopwatch]::StartNew()

$roboDir = [System.IO.Path]::GetDirectoryName($searchDir)
if (!$roboDir) {$roboDir = $searchDir.Substring(0,2)}

$info = [System.Diagnostics.ProcessStartInfo]::new()
$info.FileName = "$env:windir\system32\robocopy.exe"
$info.RedirectStandardOutput = $true
$info.Arguments = " /l ""$roboDir"" null ""$searchName"" /bytes /njh /njs /np /nc /ndl /xjd /mt /s"
$info.UseShellExecute = $false
$info.CreateNoWindow = $true
$info.WorkingDirectory = $searchDir

$process = [System.Diagnostics.Process]::new()
$process.StartInfo = $info
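[void]$process.Start()
# Read the redirected output before waiting; a full pipe buffer would otherwise block robocopy.
$fileList = $process.StandardOutput.ReadToEnd()
$process.WaitForExit()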

$timer.Stop()
write-host $timer.Elapsed.TotalSeconds "sec."

Or in a shorter version (based on a good comment):

"robocopy v2"
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$fileList = (&cmd /c pushd $searchDir `& robocopy /l "$searchDir" null "$searchName" /ns /njh /njs /np /nc /ndl /xjd /mt /s).trim() -ne ''
$timer.Stop()
write-host $timer.Elapsed.TotalSeconds "sec."

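Was it faster than DIR? Yes, absolutely! The runtime is now down to 3,2685551 sec. The main reason for this huge improvement is that ROBOCOPY runs multi-threaded when the /mt switch is given. But even without this turbo switch it was faster than DIR.

Mission accomplished? Not really, because my task was to create a PowerShell script that searches for a file as fast as possible, and calling ROBOCOPY is a bit of cheating.

Next, I wanted to see how fast we can get using [System.IO.Directory]. The first try used GetFiles and GetDirectories calls. Here is my code: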

"GetFiles"
$timer = [System.Diagnostics.Stopwatch]::StartNew()

$fileList = [System.Collections.Generic.List[string]]::new()
$dirList = [System.Collections.Generic.Queue[string]]::new()
$dirList.Enqueue($searchDir)
while ($dirList.Count -ne 0) {
    $dir = $dirList.Dequeue()
    try {
        $files = [System.IO.Directory]::GetFiles($dir, $searchName)
        if ($files) {$fileList.AddRange($files)}
        foreach($subdir in [System.IO.Directory]::GetDirectories($dir)) {
            $dirList.Enqueue($subDir)
        }
    } catch {}
}
$timer.Stop()
write-host $timer.Elapsed.TotalSeconds "sec."

This time the runtime was 19,3393872 sec., the slowest code so far. Can we do better? Here is a snippet using the Enumerate calls for comparison:

"EnumerateFiles"
$timer = [System.Diagnostics.Stopwatch]::StartNew()

$fileList = [System.Collections.Generic.List[string]]::new()
$dirList = [System.Collections.Generic.Queue[string]]::new()
$dirList.Enqueue($searchDir)
while ($dirList.Count -ne 0) {
    $dir = $dirList.Dequeue()
    try {
        foreach($file in [System.IO.Directory]::EnumerateFiles($dir, $searchName)) {
            $fileList.add($file)
        }
        foreach ($subdir in [System.IO.Directory]::EnumerateDirectories($dir)) {
            $dirList.Enqueue($subDir)
        }
    } catch {}
}

$timer.Stop()
write-host $timer.Elapsed.TotalSeconds "sec."

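With a runtime of 19,2068545 sec. it was only slightly faster.

Now let's see whether we can get it faster with direct WinAPI calls from Kernel32. Here is the code; let's see how fast it is this time: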
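The two timings above are close because both loops walk the whole tree. Where the Enumerate calls can make a difference is when only the first hit is needed: EnumerateFiles yields results lazily, so the search can stop early. A minimal sketch of that idea follows; the function name Find-FirstFile and the throwaway demo tree are assumptions made for illustration, not part of the original code.

```powershell
# Hypothetical helper (not from the original post): breadth-first search
# that returns the first matching path and then stops.
function Find-FirstFile {
    param(
        [Parameter(Mandatory)][string]$Root,
        [Parameter(Mandatory)][string]$Name
    )
    $queue = [System.Collections.Generic.Queue[string]]::new()
    $queue.Enqueue($Root)
    while ($queue.Count -gt 0) {
        $dir = $queue.Dequeue()
        try {
            # EnumerateFiles yields matches lazily, so returning here ends the scan.
            foreach ($file in [System.IO.Directory]::EnumerateFiles($dir, $Name)) {
                return $file
            }
            foreach ($subDir in [System.IO.Directory]::EnumerateDirectories($dir)) {
                $queue.Enqueue($subDir)
            }
        } catch {
            # Skip directories that cannot be read (e.g. missing permissions).
        }
    }
}

# Demo on a throwaway directory tree below the temp folder.
$demoRoot = Join-Path ([System.IO.Path]::GetTempPath()) ("find-first-" + [guid]::NewGuid())
$nested   = [System.IO.Path]::Combine($demoRoot, 'a', 'b')
$null = New-Item -ItemType Directory -Path $nested -Force
Set-Content -LiteralPath (Join-Path $nested 'hosts') -Value '127.0.0.1 localhost'
$firstHit = Find-FirstFile -Root $demoRoot -Name 'hosts'
$firstHit
Remove-Item -LiteralPath $demoRoot -Recurse -Force
```

This does not help when all matches on the partition are wanted, which is the case measured in this question.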

"WinAPI"
add-type -Name FileSearch -Namespace Win32 -MemberDefinition @"
    public struct WIN32_FIND_DATA {
        public uint dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern IntPtr FindFirstFile
      (string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern bool FindNextFile
      (IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern bool FindClose(IntPtr hFindFile);
"@

$rootDir = 'c:'
$searchFile = "hosts"

$fileList = [System.Collections.Generic.List[string]]::new()
$dirList = [System.Collections.Generic.Queue[string]]::new()
$dirList.Enqueue($rootDir)
$timer = [System.Diagnostics.Stopwatch]::StartNew()

$fileData = new-object Win32.FileSearch+WIN32_FIND_DATA
while ($dirList.Count -ne 0) {
    $dir = $dirList.Dequeue()
    $handle = [Win32.FileSearch]::FindFirstFile("$dir\*", [ref]$fileData)
    [void][Win32.FileSearch]::FindNextFile($handle, [ref]$fileData)
    while ([Win32.FileSearch]::FindNextFile($handle, [ref]$fileData)) {
        if ($fileData.dwFileAttributes -band 0x10) {
            $fullName = [string]::Join('\', $dir, $fileData.cFileName)
            $dirList.Enqueue($fullName)
        } elseif ($fileData.cFileName -eq $searchFile) {
            $fullName = [string]::Join('\', $dir, $fileData.cFileName)
            $fileList.Add($fullName)
        }
    }
    [void][Win32.FileSearch]::FindClose($handle)
}

$timer.Stop()
write-host $timer.Elapsed.TotalSeconds "sec."

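For me, the result of this approach was quite a negative surprise. The runtime is 17,499286 sec. This is faster than the System.IO calls, but still slower than a simple Get-ChildItem.

But there is still hope of getting close to the super-fast ROBOCOPY result! For Get-ChildItem we cannot make the call run multi-threaded, but for e.g. the Kernel32 calls we have the option of turning the search into a recursive function that is called for all subfolders of each iteration inside a parallel foreach loop, via embedded C# code. But how to do that?

Does someone know how to change the last code snippet to use Parallel.ForEach? Even if the result might not be as fast as ROBOCOPY, I would like to post this approach here to have the complete story for this classic "file search" topic.

Please let me know how to do the parallel code part.

Update: For completeness, I am adding the code and runtime of the GetFiles code running on PowerShell 7, with smarter access handling: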
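One detail of the loop above is worth a hedged side note: it never checks the handle returned by FindFirstFile, and it drops the first two entries by position on the assumption that they are "." and "..". The sketch below lists a single directory while checking the handle and filtering those two entries by name instead. The namespace Win32Demo, the type name DirScan, and the function Get-DirectoryEntry are made up for this example, and the Unicode character set is my choice, not the original author's. It is Windows-only.

```powershell
# Windows only. Win32Demo.DirScan is a hypothetical type name used for this sketch.
Add-Type -Namespace Win32Demo -Name DirScan -MemberDefinition @"
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    public struct WIN32_FIND_DATA {
        public uint dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    public static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Unicode)]
    public static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true)]
    public static extern bool FindClose(IntPtr hFindFile);
"@

function Get-DirectoryEntry {
    param([Parameter(Mandatory)][string]$Path)

    $data   = New-Object Win32Demo.DirScan+WIN32_FIND_DATA
    $handle = [Win32Demo.DirScan]::FindFirstFile($Path.TrimEnd('\') + '\*', [ref]$data)
    # INVALID_HANDLE_VALUE (-1) means the directory could not be opened.
    if ($handle -eq [IntPtr]::new(-1)) { return }
    try {
        do {
            # Filter the pseudo entries by name rather than by position.
            if ($data.cFileName -ne '.' -and $data.cFileName -ne '..') {
                [PSCustomObject]@{
                    Name        = $data.cFileName
                    IsDirectory = [bool]($data.dwFileAttributes -band 0x10)
                }
            }
        } while ([Win32Demo.DirScan]::FindNextFile($handle, [ref]$data))
    } finally {
        [void][Win32Demo.DirScan]::FindClose($handle)
    }
}
```

Calling Get-DirectoryEntry -Path $env:windir, for example, returns one object per entry. The extra string comparisons cost a little time per entry, so this is a correctness sketch rather than a speed-up.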

"GetFiles PS7"
$timer = [System.Diagnostics.Stopwatch]::StartNew()
$fileList = [system.IO.Directory]::GetFiles(
  $searchDir, 
  $searchFile,
  [IO.EnumerationOptions] @{AttributesToSkip = 'ReparsePoint'; RecurseSubdirectories = $true; IgnoreInaccessible = $true}
)
$timer.Stop()
write-host $timer.Elapsed.TotalSeconds "sec."

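The runtime on my system was 9,150673 sec. Faster than DIR, but still slower than robocopy with multitasking on 8 cores.

Update #2: After playing around with the new PS7 features, I came up with this code snippet, which uses my first (but ugly?) parallel code approach: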

"WinAPI PS7 parallel"
$searchDir  = "c:\"
$searchFile = "hosts"

add-type -Name FileSearch -Namespace Win32 -MemberDefinition @"
    public struct WIN32_FIND_DATA {
        public uint dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern IntPtr FindFirstFile
      (string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern bool FindNextFile
      (IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern bool FindClose(IntPtr hFindFile);
"@

$rootDir = $searchDir -replace '\\$'   # strip a trailing backslash
$maxRunSpaces = [int]$env:NUMBER_OF_PROCESSORS
$fileList = [System.Collections.Concurrent.BlockingCollection[string]]::new()
$dirList = [System.Collections.Concurrent.BlockingCollection[string]]::new()
$dirList.Add($rootDir)
$timer = [System.Diagnostics.Stopwatch]::StartNew()

(1..$maxRunSpaces) | ForEach-Object -ThrottleLimit $maxRunSpaces -Parallel {
    $dirList = $using:dirList
    $fileList = $using:fileList
    $fileData = new-object Win32.FileSearch+WIN32_FIND_DATA
    $dir = $null
    if ($_ -eq 1) {$delay = 0} else {$delay = 50}
    if ($dirList.TryTake([ref]$dir, $delay)) {
        do {
            $handle = [Win32.FileSearch]::FindFirstFile("$dir\*", [ref]$fileData)
            [void][Win32.FileSearch]::FindNextFile($handle, [ref]$fileData)
            while ([Win32.FileSearch]::FindNextFile($handle, [ref]$fileData)) {
                if ($fileData.dwFileAttributes -band 0x10) {
                    $fullName = [string]::Join('\', $dir, $fileData.cFileName)
                    $dirList.Add($fullName)
                } elseif ($fileData.cFileName -eq $using:searchFile) {
                    $fullName = [string]::Join('\', $dir, $fileData.cFileName)
                    $fileList.Add($fullName)
                }
            }
            [void][Win32.FileSearch]::FindClose($handle)
        } until (!$dirList.TryTake([ref]$dir))
    }
}

$timer.Stop()
write-host $timer.Elapsed.TotalSeconds "sec."

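The runtime is now very close to the robocopy timing. It is actually 4,0809719 sec.

Not bad, but I am still looking for a solution that uses Parallel.ForEach via embedded C# code, so that it also works on PowerShell v5.

Update #3: Here is now my final code for PowerShell 5, running in parallel runspaces: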
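The snippet above depends on two PowerShell 7 mechanics: ForEach-Object -Parallel runs each input item in its own runspace, and $using: passes a reference to an object from the calling scope, so a thread-safe collection can be shared by all workers. A stripped-down sketch of just that mechanism, with made-up variable names and squares of numbers standing in for directory work:

```powershell
# Requires PowerShell 7+ (ForEach-Object -Parallel).
# A thread-safe collection created in the caller's scope ...
$squares = [System.Collections.Concurrent.ConcurrentBag[int]]::new()

1..8 | ForEach-Object -ThrottleLimit 4 -Parallel {
    # ... is reached inside each parallel runspace via $using:.
    $bag = $using:squares
    $bag.Add($_ * $_)
}

$total = ($squares | Measure-Object -Sum).Sum
"items: $($squares.Count), sum: $total"
```

The order in which the items land in the bag is not deterministic, which is also why the file list of the parallel search comes back unsorted.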

$searchDir  = "c:\"
$searchFile = "hosts"

"WinAPI parallel"
add-type -Name FileSearch -Namespace Win32 -MemberDefinition @"
    public struct WIN32_FIND_DATA {
        public uint dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern IntPtr FindFirstFile
      (string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern bool FindNextFile
      (IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern bool FindClose(IntPtr hFindFile);
"@

$rootDir = $searchDir -replace '\\$'   # strip a trailing backslash
$maxRunSpaces = [int]$env:NUMBER_OF_PROCESSORS
$fileList = [System.Collections.Concurrent.BlockingCollection[string]]::new()
$dirList = [System.Collections.Concurrent.BlockingCollection[string]]::new()
$dirList.Add($rootDir)
$timer = [System.Diagnostics.Stopwatch]::StartNew()

$runSpaceList = [System.Collections.Generic.List[PSObject]]::new()
$pool = [RunSpaceFactory]::CreateRunspacePool(1, $maxRunSpaces)
$pool.Open()

foreach ($id in 1..$maxRunSpaces) { 
    $runSpace = [Powershell]::Create()
    $runSpace.RunspacePool = $pool
    [void]$runSpace.AddScript({
        Param (
            [string]$searchFile,
            [System.Collections.Concurrent.BlockingCollection[string]]$dirList,
            [System.Collections.Concurrent.BlockingCollection[string]]$fileList
        )
        $fileData = new-object Win32.FileSearch+WIN32_FIND_DATA
        $dir = $null
        if ($id -eq 1) {$delay = 0} else {$delay = 50}
        if ($dirList.TryTake([ref]$dir, $delay)) {
            do {
                $handle = [Win32.FileSearch]::FindFirstFile("$dir\*", [ref]$fileData)
                [void][Win32.FileSearch]::FindNextFile($handle, [ref]$fileData)
                while ([Win32.FileSearch]::FindNextFile($handle, [ref]$fileData)) {
                    if ($fileData.dwFileAttributes -band 0x10) {
                        $fullName = [string]::Join('\', $dir, $fileData.cFileName)
                        $dirList.Add($fullName)
                    } elseif ($fileData.cFileName -like $searchFile) {
                        $fullName = [string]::Join('\', $dir, $fileData.cFileName)
                        $fileList.Add($fullName)
                    }
                }
                [void][Win32.FileSearch]::FindClose($handle)
            } until (!$dirList.TryTake([ref]$dir))
        }
    })
    [void]$runSpace.addArgument($searchFile)
    [void]$runSpace.addArgument($dirList)
    [void]$runSpace.addArgument($fileList)
    $status = $runSpace.BeginInvoke()
    $runSpaceList.Add([PSCustomObject]@{Name = $id; RunSpace = $runSpace; Status = $status})
}

# Wait until every worker has finished, not just the first one.
while ($runSpaceList.Status.IsCompleted -contains $false) {Start-Sleep -Milliseconds 10}
$pool.Close() 
$pool.Dispose()

$timer.Stop()
$fileList
write-host $timer.Elapsed.TotalSeconds "sec."

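The overall runtime was 4,8586134 sec. A bit slower than the PS7 version, but still much faster than any DIR or Get-ChildItem variation. ;-)

Final solution: Finally, I was able to answer my own question. Here is the final code: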
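The runspace plumbing used above can be reduced to a small pattern: open a pool, start one PowerShell instance per work item with BeginInvoke, then call EndInvoke on each handle so that the script blocks until all workers are done. The following is a minimal sketch of that pattern only; the doubled numbers are placeholder work and the variable names are assumptions for this example.

```powershell
# Works in Windows PowerShell 5.x and PowerShell 7+.
$pool = [RunspaceFactory]::CreateRunspacePool(1, 4)
$pool.Open()

# Thread-safe result collection shared with all workers by reference.
$doubled = [System.Collections.Concurrent.ConcurrentBag[int]]::new()

$jobs = foreach ($n in 1..8) {
    $ps = [PowerShell]::Create()
    $ps.RunspacePool = $pool
    [void]$ps.AddScript({ param($n, $bag) $bag.Add($n * 2) }).AddArgument($n).AddArgument($doubled)
    [PSCustomObject]@{ PowerShell = $ps; Handle = $ps.BeginInvoke() }
}

# EndInvoke blocks until the corresponding worker has completed.
foreach ($job in $jobs) {
    $null = $job.PowerShell.EndInvoke($job.Handle)
    $job.PowerShell.Dispose()
}
$pool.Close()
$pool.Dispose()

$sum = ($doubled | Measure-Object -Sum).Sum
"items: $($doubled.Count), sum: $sum"
```

Waiting with EndInvoke avoids a polling loop and also surfaces errors raised inside a worker.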

"WinAPI parallel.foreach"

add-type -TypeDefinition @"
using System;
using System.IO;
using System.Collections;
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.Runtime.InteropServices;
using System.Threading;
using System.Threading.Tasks;
using System.Text.RegularExpressions;

public class FileSearch {
    public struct WIN32_FIND_DATA {
        public uint dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern IntPtr FindFirstFile
      (string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern bool FindNextFile
      (IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    public static extern bool FindClose(IntPtr hFindFile);

    static IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);

    public static class Globals {
        public static BlockingCollection<string> resultFileList {get;set;}
    }

    public static BlockingCollection<string> GetTreeFiles(string path, string searchFile) {
        Globals.resultFileList = new BlockingCollection<string>();
        List<string> dirList = new List<string>();
        searchFile = @"^" + searchFile.Replace(@".",@"\.").Replace(@"*",@".*").Replace(@"?",@".") + @"$";
        GetFiles(path, searchFile);
        return Globals.resultFileList;
    }

    static void GetFiles(string path, string searchFile) {
        path = path.EndsWith(@"\") ? path : path + @"\";
        List<string> dirList = new List<string>();
        WIN32_FIND_DATA fileData;
        IntPtr handle = INVALID_HANDLE_VALUE;
        handle = FindFirstFile(path + @"*", out fileData);
        if (handle != INVALID_HANDLE_VALUE) {
            FindNextFile(handle, out fileData);
            while (FindNextFile(handle, out fileData)) {
                if ((fileData.dwFileAttributes & 0x10) > 0) {
                    string fullPath = path + fileData.cFileName;
                    dirList.Add(fullPath);
                } else {
                    if (Regex.IsMatch(fileData.cFileName, searchFile, RegexOptions.IgnoreCase)) {
                        string fullPath = path + fileData.cFileName;
                        Globals.resultFileList.TryAdd(fullPath);
                    }
                }
            }
            FindClose(handle);
            Parallel.ForEach(dirList, (dir) => {
                GetFiles(dir, searchFile);
            });
        }
    }
}
"@

[fileSearch]::GetTreeFiles($searchDir, 'hosts')

And the final runtime of 3,2536388 sec. is now faster than robocopy. I also added an optimized version of that code in the solution.

tl;dr:

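This answer does not try to solve the parallel problem as asked, however:

  • A single, recursive [IO.Directory]::GetFiles() call may be fast enough, though note that if inaccessible directories are involved this is only an option in PowerShell [Core] v6.2+: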
# PowerShell [Core] v6.2+
[IO.Directory]::GetFiles(
  $searchDir, 
  $searchFile,
  [IO.EnumerationOptions] @{ AttributesToSkip = 'ReparsePoint'; RecurseSubdirectories = $true; IgnoreInaccessible = $true }
)
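  • Pragmatically speaking (outside of, say, a coding exercise), calling robocopy is a perfectly legitimate approach, assuming you only need to run on Windows, and it is as simple as (note that con is a dummy argument for the unused target-directory parameter):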
(robocopy $searchDir con $searchFile /l /s /mt /njh /njs /ns /nc /ndl /np).Trim() -ne ''
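Since the EnumerationOptions overload exists only on .NET Core 2.1+, a script that has to run on both PowerShell editions could probe for the type and fall back to a manual traversal otherwise. The following is a rough sketch of that idea; the function name Find-FileByName is invented here, and the fallback branch, unlike the first branch, does not skip reparse points.

```powershell
# Hypothetical wrapper (not from the original answer).
function Find-FileByName {
    param(
        [Parameter(Mandatory)][string]$Root,
        [Parameter(Mandatory)][string]$Name
    )
    if ('System.IO.EnumerationOptions' -as [type]) {
        # .NET Core 2.1+ / PowerShell [Core] 6.2+: one recursive call.
        $options = [System.IO.EnumerationOptions] @{
            AttributesToSkip      = 'ReparsePoint'
            RecurseSubdirectories = $true
            IgnoreInaccessible    = $true
        }
        return [System.IO.Directory]::GetFiles($Root, $Name, $options)
    }

    # .NET Framework: walk the tree manually and swallow access errors per directory.
    $found = [System.Collections.Generic.List[string]]::new()
    $queue = [System.Collections.Generic.Queue[string]]::new()
    $queue.Enqueue($Root)
    while ($queue.Count -gt 0) {
        $dir = $queue.Dequeue()
        try {
            $found.AddRange([System.IO.Directory]::GetFiles($dir, $Name))
            foreach ($subDir in [System.IO.Directory]::GetDirectories($dir)) {
                $queue.Enqueue($subDir)
            }
        } catch {
            # Directory not accessible: skip it and continue.
        }
    }
    return $found
}

# Demo on a throwaway directory tree below the temp folder.
$demoRoot = Join-Path ([System.IO.Path]::GetTempPath()) ("find-byname-" + [guid]::NewGuid())
$nested   = [System.IO.Path]::Combine($demoRoot, 'etc', 'deep')
$null = New-Item -ItemType Directory -Path $nested -Force
Set-Content -LiteralPath (Join-Path $nested 'hosts') -Value '127.0.0.1 localhost'
$hits = @(Find-FileByName -Root $demoRoot -Name 'hosts')
$hits
Remove-Item -LiteralPath $demoRoot -Recurse -Force
```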

A few points up front:

  • but calling ROBOCOPY is a bit of cheating.

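    • Arguably, using .NET APIs / WinAPI calls is just as much cheating as calling an external utility such as RoboCopy (e.g. robocopy.exe /l ...). After all, calling external programs is a core mandate of any shell, including PowerShell (and neither System.Diagnostics.Process nor its PowerShell wrapper, Start-Process, is required for that). That said, while not a problem in this case, you do lose the ability to pass and receive objects when you call an external program, and in-process operations are generally faster.
  • For timing the execution of commands (measuring performance), PowerShell offers a high-level wrapper around System.Diagnostics.Stopwatch: the Measure-Command cmdlet.

  • Such performance measurements fluctuate, because PowerShell, as a dynamically resolved language, employs many caches that incur overhead when they are first filled, and you generally won't know when that happens; see this GitHub issue for background information.

  • Additionally, a long-running command that traverses the file system is subject to interference from other processes running at the same time, and whether file-system information has already been cached by a previous run makes a big difference.

  • The following comparison uses a higher-level wrapper around Measure-Object, the Time-Command function, which makes it easy to compare the relative runtime performance of multiple commands.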
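To illustrate the two points about Measure-Command and fluctuating timings, here is a small sketch that times the same command several times and reports the spread. The measured command (listing the temp folder) and the run count of 5 are arbitrary choices for this example.

```powershell
# Time the same command five times; Measure-Command returns a TimeSpan per run.
$samples = foreach ($run in 1..5) {
    (Measure-Command {
        $null = [System.IO.Directory]::GetFiles([System.IO.Path]::GetTempPath())
    }).TotalSeconds
}

# The first run often differs from later ones because caches are still cold.
$stats = $samples | Measure-Object -Average -Minimum -Maximum
'min {0:N4}  avg {1:N4}  max {2:N4} sec.' -f $stats.Minimum, $stats.Average, $stats.Maximum
```

Reporting the minimum and maximum alongside the average makes it easier to see whether a single run was an outlier.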


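The key to speeding up PowerShell code is to minimize the actual PowerShell code and to offload as much work as possible to .NET method calls / (compiled) external programs.

The following contrasts the performance of:

  • Get-ChildItem (just for contrast; we know that it is too slow)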

  • robocopy.exe

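  • A single, recursive call to System.IO.Directory.GetFiles(), which may be fast enough for your purposes, despite being single-threaded.

    • Note: The call below uses features available only in .NET Core 2.1+ and therefore works in PowerShell [Core] v6.2+ only. The .NET Framework version of this API doesn't allow ignoring inaccessible directories (due to lack of permissions), which causes the enumeration to fail if such directories are encountered.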
$searchDir = 'C:\'
$searchFile = 'hosts'

# Define the commands to compare as an array of script blocks.
$cmds = 
  { 
    [IO.Directory]::GetFiles(
      $searchDir, 
      $searchFile,
      [IO.EnumerationOptions] @{ AttributesToSkip = 'ReparsePoint'; RecurseSubdirectories = $true; IgnoreInaccessible = $true }
    )
  },
  {
    (Get-ChildItem -Literalpath $searchDir -File -Recurse -Filter $searchFile -ErrorAction Ignore -Force).FullName
  },
  {
    (robocopy $searchDir con $searchFile /l /s /mt /njh /njs /ns /nc /ndl /np).Trim() -ne ''
  } 

Write-Verbose -vb "Warming up the cache..."
# Run one of the commands up front to level the playing field
# with respect to cached filesystem information.
$null = & $cmds[-1]

# Run the commands and compare their timings.
Time-Command $cmds -Count 1 -OutputToHost -vb

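On my 2-core Windows 10 VM running PowerShell Core 7.1.0-preview.7 I get the following results; the numbers vary based on many factors (not just the number of files), but should provide a general sense of relative performance (column Factor).

Note that since the file-system cache is deliberately warmed up beforehand, the numbers for a given machine will be too optimistic compared to a run without cached information.

As you can see, the PowerShell [Core] [System.IO.Directory]::GetFiles() call actually outperformed the multi-threaded robocopy call in this case.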

VERBOSE: Warming up the cache...
VERBOSE: Starting 1 run(s) of:
    [IO.Directory]::GetFiles(
      $searchDir,
      $searchFile,
      [IO.EnumerationOptions] @{ AttributesToSkip = 'ReparsePoint'; RecurseSubdirectories = $true; IgnoreInaccessible = $true }
    )
  ...
C:\Program Files\Git\etc\hosts
C:\Windows\WinSxS\amd64_microsoft-windows-w..ucture-other-minwin_31bf3856ad364e35_10.0.18362.1_none_079d0d71e24a6112\hosts
C:\Windows\System32\drivers\etc\hosts
C:\Users\jdoe\AppData\Local\Packages\CanonicalGroupLimited.Ubuntu18.04onWindows_79rhkp1fndgsc\LocalState\rootfs\etc\hosts
VERBOSE: Starting 1 run(s) of:
    (Get-ChildItem -Literalpath $searchDir -File -Recurse -Filter $searchFile -ErrorAction Ignore -Force).FullName
  ...
C:\Program Files\Git\etc\hosts
C:\Users\jdoe\AppData\Local\Packages\CanonicalGroupLimited.Ubuntu18.04onWindows_79rhkp1fndgsc\LocalState\rootfs\etc\hosts
C:\Windows\System32\drivers\etc\hosts
C:\Windows\WinSxS\amd64_microsoft-windows-w..ucture-other-minwin_31bf3856ad364e35_10.0.18362.1_none_079d0d71e24a6112\hosts
VERBOSE: Starting 1 run(s) of:
    (robocopy $searchDir con $searchFile /l /s /mt /njh /njs /ns /nc /ndl /np).Trim() -ne ''
  ...
C:\Program Files\Git\etc\hosts
C:\Windows\WinSxS\amd64_microsoft-windows-w..ucture-other-minwin_31bf3856ad364e35_10.0.18362.1_none_079d0d71e24a6112\hosts
C:\Windows\System32\drivers\etc\hosts
C:\Users\jdoe\AppData\Local\Packages\CanonicalGroupLimited.Ubuntu18.04onWindows_79rhkp1fndgsc\LocalState\rootfs\etc\hosts

VERBOSE: Overall time elapsed: 00:01:48.7731236
Factor Secs (1-run avg.) Command
------ ----------------- -------
1.00   22.500            [IO.Directory]::GetFiles(…
1.14   25.602            (robocopy /l $searchDir NUL $searchFile /s /mt /njh /njs /ns /nc /np).Trim() -ne ''
2.69   60.623            (Get-ChildItem -Literalpath $searchDir -File -Recurse -Filter $searchFile -ErrorAction Ignore -Force).FullName

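Here is the final code I created. The runtime is now 2,8627695 sec. Limiting the parallelism to the number of logical cores gave better performance than running a Parallel.ForEach over all subdirectories.

Instead of returning only the file name, you can return the full FileInfo object for each hit in the resulting BlockingCollection.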

# powershell-sample to find all "hosts"-files on Partition "c:\"

cls
Remove-Variable * -ea 0
[System.GC]::Collect()
$ErrorActionPreference = "stop"

$searchDir  = "c:\"
$searchFile = "hosts"

add-type -TypeDefinition @"
using System;
using System.IO;
using System.Linq;
using System.Collections.Concurrent;
using System.Runtime.InteropServices;
using System.Threading.Tasks;
using System.Text.RegularExpressions;

public class FileSearch {
    public struct WIN32_FIND_DATA {
        public uint dwFileAttributes;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftCreationTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastAccessTime;
        public System.Runtime.InteropServices.ComTypes.FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    static extern IntPtr FindFirstFile
        (string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    static extern bool FindNextFile
        (IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true, CharSet = CharSet.Ansi)]
    static extern bool FindClose(IntPtr hFindFile);

    static IntPtr INVALID_HANDLE_VALUE = new IntPtr(-1);
    static BlockingCollection<string> dirList {get;set;}
    static BlockingCollection<string> fileList {get;set;}

    public static BlockingCollection<string> GetFiles(string searchDir, string searchFile) {
        bool isPattern = false;
        if (searchFile.Contains(@"?") | searchFile.Contains(@"*")) {
            searchFile = @"^" + searchFile.Replace(@".",@"\.").Replace(@"*",@".*").Replace(@"?",@".") + @"$";
            isPattern = true;
        }
        fileList = new BlockingCollection<string>();
        dirList = new BlockingCollection<string>();
        dirList.Add(searchDir);
        int[] threads = Enumerable.Range(1,Environment.ProcessorCount).ToArray();
        Parallel.ForEach(threads, (id) => {
            string path;
            IntPtr handle = INVALID_HANDLE_VALUE;
            WIN32_FIND_DATA fileData;
            if (dirList.TryTake(out path, 100)) {
                do {
                    path = path.EndsWith(@"\") ? path : path + @"\";
                    handle = FindFirstFile(path + @"*", out fileData);
                    if (handle != INVALID_HANDLE_VALUE) {
                        FindNextFile(handle, out fileData);
                        while (FindNextFile(handle, out fileData)) {
                            if ((fileData.dwFileAttributes & 0x10) > 0) {
                                string fullPath = path + fileData.cFileName;
                                dirList.TryAdd(fullPath);
                            } else {
                                if (isPattern) {
                                    if (Regex.IsMatch(fileData.cFileName, searchFile, RegexOptions.IgnoreCase)) {
                                        string fullPath = path + fileData.cFileName;
                                        fileList.TryAdd(fullPath);
                                    }
                                } else {
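                                    // Compare case-insensitively, consistent with the pattern branch above.
                                    if (string.Equals(fileData.cFileName, searchFile, StringComparison.OrdinalIgnoreCase)) {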
                                        string fullPath = path + fileData.cFileName;
                                        fileList.TryAdd(fullPath);
                                    }
                                }
                            }
                        }
                        FindClose(handle);
                    }
                } while (dirList.TryTake(out path));
            }
        });
        return fileList;
    }
}
"@

$fileList = [fileSearch]::GetFiles($searchDir, $searchFile)
$fileList
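If FileInfo objects are wanted without touching the embedded C# code, one option is to wrap the returned paths afterwards on the PowerShell side. A short sketch under that assumption; a temporary file stands in for the real search result here, and the variable names are made up for the example.

```powershell
# Stand-in for the $fileList returned by the search: a single temporary file.
$samplePath = [System.IO.Path]::GetTempFileName()
$paths = @($samplePath)

# Wrap each path in a FileInfo object to get size, timestamps, etc.
$fileInfos = foreach ($path in $paths) { [System.IO.FileInfo]::new($path) }
$fileInfos | Select-Object FullName, Length, LastWriteTime

Remove-Item -LiteralPath $samplePath
```

Creating the objects after the search keeps the hot loop free of extra allocations, at the cost of one additional file-system lookup per hit when a property is first read.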