Print pdf files on different printers depending on their content

Question

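I would like to print .pdf files on different printers, depending on their content. How can I check whether a specific word is present in a file? To work through the contents of a folder, I have built the following so far: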

Unblock-File -Path S:\test\itextsharp.dll
Add-Type -Path S:\test\itextsharp.dll
$files = Get-ChildItem S:\test\*.pdf
$adobe='C:\Program Files (x86)\Adobe\Acrobat DC\Acrobat\Acrobat.exe'
foreach ($file in $files) {
  $reader = [iTextSharp.text.pdf.parser.PdfTextExtractor]
  $Extract = $reader::GetTextFromPage($File.FullName,1)
  if ($Extract -Contains 'Lieferschein') {
    Write-Host -ForegroundColor Yellow "Lieferschein"
    $printername='XX1'
    $drivername='XX1'
    $portname='192.168.X.41'
  } else {
    Write-Host -ForegroundColor Yellow "Etikett"
    $printername='XX2'
    $drivername='XX2'
    $portname='192.168.X.42'
  }
  $arglist = '/S /T "' + $file.FullName + '" "' + $printername + '" "' + $drivername + " " + $portname
  start-process $adobe -argumentlist $arglist -wait
  Start-Sleep -Seconds 15
  Remove-Item $file.FullName
}

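Now I am facing 2 problems:

1st: Add-Type -Path itextsharp.dll gives me an error.

Add-Type: One or more types in the assembly cannot be loaded. Get the LoaderExceptions property for more information. In line: 2 character: 1

I have read that this can be caused by the file being blocked. However, the file properties show nothing about that, and the Unblock-File command at the start of the script does not change or solve anything.

After running $error[0].exception.loaderexceptions[0] I get the information that BouncyCastle.Crypto, Version=1.8.6.0 is missing. Unfortunately, I have not been able to find a source for it yet.

2nd: Will if ($Extract -Contains 'Lieferschein') work the way I expect? Will it check for the phrase once Add-Type loads successfully?

Alternatively: it would also be possible to make the decision depend on the page format. One kind of file is DIN A4 size; the other is smaller than that. If there is an easier way to check that, you would make me happy as well.

Thanks in advance!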

Answer 1

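Searching for keywords in a pdf with PowerShell and iTextSharp.dll is a pretty common thing to do. You then just use your conditional logic to send the file to whichever printer you choose. So, something like this should do: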

Add-Type -Path 'C:\path_to_dll\itextsharp.dll'

$pdfs     = Get-ChildItem 'C:\path_to_pdfs' -Filter '*.pdf'
$export   = 'D:\Temp\PdfExport.csv'
$results  = @()
$keywords = @('Keyword1')

foreach ($pdf in $pdfs)
{
    "processing - $($pdf.FullName)"
    $reader = New-Object iTextSharp.text.pdf.pdfreader -ArgumentList $pdf.FullName

    for ($page = 1; $page -le $reader.NumberOfPages; $page++)
    {
        $pageText = [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader, $page).Split([char]0x000A)

        foreach ($keyword in $keywords)
        {
            if ($pageText -match $keyword)
            {
                $response = @{
                    keyword = $keyword
                    file    = $pdf.FullName
                    page    = $page
                }
                $results += New-Object PSObject -Property $response
            }
        }
    }

    $reader.Close()
}

"`ndone"

$results | 
Export-Csv $export -NoTypeInformation

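Update

As per your comment, regarding your error.

Again, iTextSharp is legacy, and you really should move to iText7.

This is not a PowerShell code issue, though. It is a missing dependency of iTextSharp.dll. Even with iText7, you need to make sure that all dependencies are present on your machine and loaded properly.

As described in this SO Q&A: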

Answer 2

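First:

After finding the correct version (1.8.6) on nuget.org, the Add-Type command worked perfectly. As expected, I did not even need the unblock command, because the file was not marked as blocked in its properties. The script now starts with: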

Add-Type -Path 'c:\BouncyCastle.Crypto.dll'
Add-Type -Path 'c:\itextsharp.dll'
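In case another dependency turns out to be missing, the loader exceptions mentioned in the question can be listed in one go. A minimal sketch, assuming the load failure surfaces as an exception with a LoaderExceptions property (as it did in the question); the function name Get-AssemblyLoadError and the path are hypothetical:

```powershell
# Hypothetical helper: tries to load an assembly and returns the error messages,
# or nothing if the assembly loads without problems.
function Get-AssemblyLoadError {
    param([string]$Path)
    try {
        Add-Type -Path $Path -ErrorAction Stop
    }
    catch {
        # When types cannot be loaded, LoaderExceptions holds one entry per failure,
        # typically naming the missing dependency.
        $loaderErrors = $_.Exception.LoaderExceptions
        if ($loaderErrors) {
            $loaderErrors | ForEach-Object { $_.Message } | Sort-Object -Unique
        }
        else {
            # Any other failure (for example, a wrong path) only has the top-level message.
            $_.Exception.Message
        }
    }
}

# Hypothetical path; replace with the real location of the DLL.
Get-AssemblyLoadError -Path 'C:\path_to_dll\itextsharp.dll'
```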

Second:

Regarding the check: I only needed to replace -contains with -match in the if clause.

if ($Extract -match 'Lieferschein')
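The difference between the two operators can be seen without any pdf involved. A small sketch with a made-up sample string standing in for the extracted page text:

```powershell
# Made-up sample: extracted page text arrives as one multi-line string.
$Extract = "Firma XY`nLieferschein Nr. 4711`nDatum: 01.01.2021"

# -contains tests collection membership: is some element equal to the right-hand value?
# A single string counts as a one-element collection, and it is not equal to 'Lieferschein'.
$Extract -contains 'Lieferschein'                        # False

# -match tests a regular expression against the string, so finding the word anywhere is enough.
$Extract -match 'Lieferschein'                           # True

# For a literal, case-sensitive substring test, the .NET string method works as well.
$Extract.Contains('Lieferschein')                        # True

# -contains returns True only when an element equals the value as a whole.
@('Lieferschein', 'Etikett') -contains 'Lieferschein'    # True
```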
