Add the result of a PowerShell Start-Process to a file instead of replacing it with -RedirectStandardOutput

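I use the following command in PowerShell to convert files in the background, but I would like to log all the results in a single file. Currently, -RedirectStandardOutput replaces the file on every run.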

foreach ($l in gc ./files.txt) {Start-Process -FilePath "c:\Program Files (x86)\calibre2\ebook-convert.exe" -Argumentlist "'$l' '$l.epub'" -Wait -WindowStyle Hidden -RedirectStandardOutput log.txt}

I tried a redirection instead, but then the log is empty. If possible, I would like to keep this a one-liner.

foreach ($l in gc ./files.txt) {Start-Process -FilePath "c:\Program Files (x86)\calibre2\ebook-convert.exe" -Argumentlist "`"$l`" `"$l.epub`"" -Wait -WindowStyle Hidden *> log.txt}

If you invoke the program directly instead of using Start-Process, you can use a redirection that appends to the file:

foreach ($l in gc ./files.txt) {& 'C:\Program Files (x86)\calibre2\ebook-convert.exe' "$l" "$l.epub" *>> log.txt}
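
To see the difference between replacing and appending in isolation, here is a minimal sketch. It assumes Windows and uses cmd.exe's built-in echo as a stand-in for ebook-convert.exe; the demo-*.txt file names are made up for the illustration.

```powershell
# Stand-in for the real converter: cmd.exe's built-in echo (Windows only).
# Start from a clean slate so that earlier runs don't affect the result.
Remove-Item -ErrorAction Ignore demo-replace.txt, demo-append.txt

foreach ($n in 1..3) {
  # *> re-creates the file in every iteration, so only the last run survives.
  & cmd.exe /c "echo run $n" *> demo-replace.txt
  # *>> appends to the file (creating it on first use).
  & cmd.exe /c "echo run $n" *>> demo-append.txt
}

# Compare the line counts of the two files.
@(Get-Content demo-replace.txt).Count
@(Get-Content demo-append.txt).Count

# Clean up the demo files.
Remove-Item demo-replace.txt, demo-append.txt
```

The first count should come out as 1 and the second as 3, which mirrors what happens to log.txt with -RedirectStandardOutput versus *>>.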

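If sequential, synchronous execution is acceptable, you can simplify your command to use a single output redirection (the assumption is that ebook-convert.exe is a console-subsystem application, which PowerShell therefore executes synchronously, i.e. in a blocking manner):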

Get-Content ./files.txt | ForEach-Object {
  & 'c:\Program Files (x86)\calibre2\ebook-convert.exe' $_ "$_.epub" 
} *> log.txt

Placing * before > tells PowerShell to redirect all output streams, which in the case of external programs means both stdout and stderr.

If you want to control the character encoding, use Out-File (for which > is effectively an alias) with its -Encoding parameter; or, preferably, with text output (which external-program output always is in PowerShell), use Set-Content. To also capture stderr output, append *>&1 to the command in the pipeline segment before the Out-File / Set-Content call.

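Note that PowerShell (at least up to version 7.3) never passes raw output from external programs through to files: the output is always first decoded into .NET strings, based on the encoding stored in [Console]::OutputEncoding (by default, the system's active legacy OEM code page), and then re-encoded on saving to a file, using the file-writing cmdlet's own default encoding, unless overridden with -Encoding.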
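
As a rough sketch of the Set-Content variant described above, assuming UTF-8 is the encoding you want for the log (the choice of encoding is an assumption, not something the question specifies):

```powershell
# Assumption: UTF-8 is the desired encoding of the log file.
# Note: in Windows PowerShell, utf8 writes a BOM; in PowerShell (Core) 7+ it does not.
Get-Content ./files.txt | ForEach-Object {
  # *>&1 merges all of the program's output streams (stdout and stderr)
  # into PowerShell's success output stream, so that they flow through the pipeline.
  & 'c:\Program Files (x86)\calibre2\ebook-convert.exe' $_ "$_.epub" *>&1
} | Set-Content -Encoding utf8 log.txt
```

Because a single Set-Content call receives the output of all conversions, nothing gets replaced between files. If the log should also survive across separate runs of the script, Add-Content (which also has an -Encoding parameter) could be substituted for Set-Content.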


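If you want asynchronous, parallel execution (such as via Start-Process, which is asynchronous by default), your best bet is to:

  • Write to separate (temporary) files:

    • Pass a different output file to -RedirectStandardOutput / -RedirectStandardError in each invocation.

    • Note that if you want to merge stdout and stderr output and capture it in the same file, you must call your .exe file via a shell (possibly another PowerShell instance) and use its redirection features; for PowerShell, that would be *>log.txt; for cmd.exe (as shown below), it would be > log.txt 2>&1

  • Wait for all launched processes to finish:

    • Pass -PassThru to Start-Process and collect the process-information objects returned.

    • Then use Wait-Process to wait for all processes to terminate; use the -Timeout parameter as needed.

  • Then merge them into a single log file.

Here's an implementation: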

$procsAndLogFiles = 
  Get-Content ./files.txt | ForEach-Object -Begin { $i = 0 } {
    # Create a distinct log file for each process,
    # and return its name along with a process-information object representing
    # each process as a custom object.
    $logFile = 'log{0:000}.txt' -f ++$i
    [pscustomobject] @{
      LogFile = $logFile
      Process = Start-Process -PassThru -WindowStyle Hidden `
                  -FilePath 'cmd.exe' `
                  -Argumentlist "/c `"`"c:\Program Files (x86)\calibre2\ebook-convert.exe`" `"$_`" `"$_.epub`" >`"$logFile`" 2>&1`"" 
    }
  }

# Wait for all processes to terminate.
# Add -Timeout and error handling as needed.
$procsAndLogFiles.Process | Wait-Process

# Merge all log files.
Get-Content -LiteralPath $procsAndLogFiles.LogFile > log.txt

# Clean up.
Remove-Item -LiteralPath $procsAndLogFiles.LogFile

If you want throttled parallel execution, so as to limit how many background processes can run at a time:

# Limit how many background processes may run in parallel at most.
$maxParallelProcesses = 10

# Initialize the log file.
# Add -Force to unconditionally replace an existing file.
$null = New-Item log.txt

# Initialize the list in which those input files whose conversion
# failed due to timing out are recorded.
$allTimedOutFiles = [System.Collections.Generic.List[string]]::new()

# Process the input files in batches of $maxParallelProcesses
Get-Content -ReadCount $maxParallelProcesses ./files.txt |
  ForEach-Object {

    $i = 0
    $launchInfos = foreach ($file in $_) {
      # Create a distinct log file for each process,
      # and return its name along with the input file name / path, and 
      # a process-information object representing each process, as a custom object.
      $logFile = 'log{0:000}.txt' -f ++$i
      [pscustomobject] @{
        InputFile = $file
        LogFile = $logFile
        Process = Start-Process -PassThru -WindowStyle Hidden `
          -FilePath 'cmd.exe' `
          -ArgumentList "/c `"`"c:\Program Files (x86)\calibre2\ebook-convert.exe`" `"$file`" `"$file.epub`" >`"$logFile`" 2>&1`""
      }
    }

    # Wait for the processes to terminate, with a timeout.
    $launchInfos.Process | Wait-Process -Timeout 30 -ErrorAction SilentlyContinue -ErrorVariable errs

    # If not all processes terminated within the timeout period,
    # forcefully terminate those that didn't.
    if ($errs) {
      $timedOut = $launchInfos | Where-Object { -not $_.Process.HasExited }
      Write-Warning "Conversion of the following input files timed out; the processes will be killed:`n$($timedOut.InputFile)"
      $timedOut.Process | Stop-Process -Force
      $allTimedOutFiles.AddRange(@($timedOut.InputFile))
    }

    # Merge all temp. log files and append to the overall log file.
    $tempLogFiles = Get-Item -ErrorAction Ignore -LiteralPath ($launchInfos.LogFile | Sort-Object)
    $tempLogFiles | Get-Content >> log.txt

    # Clean up.
    $tempLogFiles | Remove-Item

  }

# * log.txt now contains all combined logs
# * $allTimedOutFiles now contains all input file names / paths 
#   whose conversion was aborted due to timing out.

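Note that the above throttling technique isn't optimal, because each batch of inputs is waited for as a whole, and only then is the next batch started. A better approach is to launch a new process as soon as one of the available parallel "slots" opens up, as shown in the next section; note, however, that this requires PowerShell (Core) 7+.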


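PowerShell (Core) 7+: Efficiently throttled parallel execution, using ForEach-Object -Parallel:

PowerShell (Core) 7+ introduced thread-based parallelism to the ForEach-Object cmdlet, via the -Parallel parameter, which has built-in throttling that defaults to a maximum of 5 threads, but can be controlled explicitly via the -ThrottleLimit parameter.

This enables efficient throttling, because a new thread is started as soon as an available slot opens up.

The following is a self-contained example that demonstrates the technique; it works on both Windows and Unix-like platforms:

  • The inputs are 9 integers, and the conversion process is simulated simply by sleeping for a random number of seconds between 1 and 9, followed by echoing the input number.

  • A timeout of 6 seconds is applied to each child process, which means that a random number of child processes will time out and be killed.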

#requires -Version 7

# Use ForEach-Object -Parallel to launch child processes in parallel,
# limiting the number of parallel threads (from which the child processes are 
# launched) via -ThrottleLimit.
# -AsJob returns a single job whose child jobs track the threads created.
$job = 
 1..9 | ForEach-Object -ThrottleLimit 3 -AsJob -Parallel {
  # Determine a temporary, thread-specific log file name.
  $logFile = 'log_{0:000}.txt' -f $_
  # Pick a random sleep time that may or may not be smaller than the timeout period.
  # Note: -Maximum is exclusive, so 10 yields values from 1 through 9.
  $sleepTime = Get-Random -Minimum 1 -Maximum 10
  # Launch the external program asynchronously and save information about
  # the newly launched child process.
  if ($env:OS -eq 'Windows_NT') {
    $ps = Start-Process -PassThru -WindowStyle Hidden cmd.exe "/c `"timeout $sleepTime >NUL & echo $_ >$logFile 2>&1`""
  }
  else { # macOS, Linux
    $ps = Start-Process -PassThru sh "-c `"{ sleep $sleepTime; echo $_; } >$logFile 2>&1`""
  }
  # Wait for the child process to exit within a given timeout period.
  $ps | Wait-Process -Timeout 6 -ErrorAction SilentlyContinue
  # Check if a timeout has occurred (implied by the process not having exited yet)
  $timedOut = -not $ps.HasExited
  if ($timedOut) {
    # Note: Only [Console]::WriteLine produces immediate output, directly to the display.
    [Console]::WriteLine("Warning: Conversion timed out for: $_")
    # Kill the timed-out process.
    $ps | Stop-Process -Force
  }
  # Construct and output a custom object that indicates the input at hand,
  # the associated log file, and whether a timeout occurred.
  [pscustomobject] @{
    InputFile = $_
    LogFile = $logFile
    TimedOut = $timedOut
  }
 }

# Wait for all child processes to exit or be killed
$processInfos = $job | Receive-Job -Wait -AutoRemoveJob

# Merge all temporary log files into an overall log file.
$tempLogFiles = Get-Item -ErrorAction Ignore -LiteralPath ($processInfos.LogFile | Sort-Object)
$tempLogFiles | Get-Content > log.txt

# Clean up the temporary log files.
$tempLogFiles | Remove-Item

# To illustrate the results, show the overall log file's content
# and which inputs caused timeouts.
[pscustomobject] @{
  CombinedLogContent = Get-Content -Raw log.txt
  InputsThatFailed = ($processInfos | Where-Object TimedOut).InputFile
} | Format-List

# Clean up the overall log file.
Remove-Item log.txt
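
For what it's worth, here is how the same technique might be applied to the actual conversion. This is an untested sketch under several assumptions: the 30-second timeout and the throttle limit of 3 are arbitrary, the GUID-based temporary file names are made up for the illustration, and the converter is launched directly (without cmd.exe) so that Stop-Process acts on the converter process itself, which in turn means that stdout and stderr have to go to two separate files.

```powershell
#requires -Version 7

$exe = 'c:\Program Files (x86)\calibre2\ebook-convert.exe'

$results = Get-Content ./files.txt | ForEach-Object -ThrottleLimit 3 -Parallel {
  $file = $_
  $exe = $using:exe   # copy the caller's variable into this thread's runspace

  # Hypothetical naming scheme: a GUID keeps the temporary file names unique across threads.
  $id = [guid]::NewGuid().ToString('N')
  $outFile = "log_$id.out.txt"
  $errFile = "log_$id.err.txt"

  # Launch the converter directly; Start-Process requires different
  # target files for stdout and stderr.
  $ps = Start-Process -PassThru -WindowStyle Hidden -FilePath $exe `
          -ArgumentList "`"$file`" `"$file.epub`"" `
          -RedirectStandardOutput $outFile -RedirectStandardError $errFile

  # Per-file timeout (30 seconds is an assumed value).
  $ps | Wait-Process -Timeout 30 -ErrorAction SilentlyContinue
  $timedOut = -not $ps.HasExited
  if ($timedOut) { $ps | Stop-Process -Force }

  # Report the input file, its temporary log files, and whether it timed out.
  [pscustomobject] @{
    InputFile = $file
    LogFiles  = $outFile, $errFile
    TimedOut  = $timedOut
  }
}

# Merge the temporary log files into the overall log file, then clean up.
$tempLogFiles = Get-Item -ErrorAction Ignore -LiteralPath $results.LogFiles
$tempLogFiles | Get-Content > log.txt
$tempLogFiles | Remove-Item

# List the input files whose conversion was aborted.
($results | Where-Object TimedOut).InputFile
```

The merged log follows the order in which the threads happened to finish, not the order of files.txt. If a killed process still holds one of its log files open for a moment, the clean-up step may need a retry.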

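I'm currently working on an adaptation of mklement0's answer. ebook-convert.exe often hangs, so I need to shut a process down if it takes longer than a specified time. This needs to run asynchronously, because of the number of files and the processor time taken (5% to 25%, depending on the conversion). The timeout needs to apply per file, not to the job as a whole.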

$procsAndLogFiles = 
  Get-Content ./files.txt | ForEach-Object -Begin { $i = 0 } {
    # Create a distinct log file for each process,
    # and return its name along with a process-information object representing
    # each process as a custom object.
    $logFile = 'd:\temp\log{0:000}.txt' -f ++$i
    Write-Host "$(Get-Date) $_"
    [pscustomobject] @{
      LogFile = $logFile
      Process = Start-Process `
        -PassThru `
        -FilePath "c:\Program Files (x86)\calibre2\ebook-convert.exe" `
        -Argumentlist "`"$_`" `"$_.epub`"" `
        -WindowStyle Hidden `
        -RedirectStandardOutput $logFile `
        | Wait-Process -Timeout 30
    }
  }

# Wait for all processes to terminate.
# Add -Timeout and error handling as needed.
$procsAndLogFiles.Process

# Merge all log files.
Get-Content -LiteralPath $procsAndLogFiles.LogFile > log.txt

# Clean up.
Remove-Item -LiteralPath $procsAndLogFiles.LogFile

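Since the problem in my other answer wasn't fully solved (not all processes that exceeded the timeout limit were killed), I rewrote it in Ruby. It isn't PowerShell, but if you land on this question and also know Ruby (or even if you don't), it may help you. I believe it is the use of threads that solved the killing problem.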

require 'logger'

LOG        = Logger.new("log.txt")
PROGRAM    = 'c:\Program Files (x86)\calibre2\ebook-convert.exe'
LIST       = 'E:\ebooks\english\_convert\mobi\files.txt'
TIMEOUT    = 30
MAXTHREADS = 6

def run file, log: nil
  output = ""
  command  = %Q{"#{PROGRAM}" "#{file}" "#{file}.epub"}
  # Merge stderr into stdout (once), so that both end up in the log.
  IO.popen(command+" 2>&1") do |io|
    begin
      while (line=io.gets) do
        output += line
        log.info line.chomp if log
      end
    rescue => ex
        log.error ex.message
      system("taskkill /f /pid #{io.pid}") rescue log.error $@
    end
  end
  if File.exist? "#{file}.epub"
    puts "converted   #{file}.epub" 
    File.delete(file)
  else
    puts "error       #{file}" 
  end
  output
end

threads = []

File.readlines(LIST).each do |file|
    file.chomp! # remove line feed
  # some checks
    if !File.exist? file
        puts "not found   #{file}"
        next
    end
    if File.exist? "#{file}.epub"
        puts "skipping    #{file}"
        File.delete(file) if File.exist? file
        next
    end

    # go on with the conversion
    thread = Thread.new {run(file, log: LOG)}
    threads << thread
    next if threads.length < MAXTHREADS
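    # Wait up to TIMEOUT seconds for each thread, then kill those still running.
    threads.each do |t|
        t.join(TIMEOUT)
        t.kill if t.alive?
    end
    # Every thread in this batch has now finished or been killed.
    threads.clear
end

# Give the threads of the final, partial batch the same chance to finish.
threads.each do |t|
    t.join(TIMEOUT)
    t.kill if t.alive?
end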