Split pdf with pdftk (closed)
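I am searching multiple PDFs in a folder for a keyword. If the keyword is found, I want to split the PDF at the page where it occurs and save the result as a new PDF.
Code: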
Add-Type -Path '...\itextsharp.5.5.13.1 (1)\lib\itextsharp.dll'
$pdfs = gci "C:\Users\..\Plan\" *.pdf
$keywords = "TEST"
$pdftk = "C:\Program Files (x86)\PDFtk\bin\pdftk.exe"
$output = "C:\Users\...\new"
$newpdf = New-Object -TypeName psobject
foreach($pdf in $pdfs) {
    Write-Host "processing -" $pdf.FullName
    # prepare the pdf
    $reader = New-Object iTextSharp.text.pdf.pdfreader -ArgumentList $pdf.FullName
    # for each page
    for($page = 1; $page -le $reader.NumberOfPages; $page++) {
        # set the page text
        $pageText = [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader,$page).Split([char]0x000A)
        # if the page text contains keyword
        if($pageText -match $keywords) {
            break
        }
    }
    #$reader.Close()
    $FirstPage = $page
    $LastPage = $reader.NumberOfPages
    Write-Host "Starting page is: " $FirstPage
    Write-Host "Last page is: " $LastPage
    & $pdftk $pdf.FullName cat $FirstPage-end output "$output\test.pdf"
}
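You are missing the output keyword. Use
& $pdftk $pdf.FullName cat $FirstPage-end output "$output\test.pdf"
The reason you get this odd error is that, without output, pdftk reads the output path as one more page range for cat, so the leading C of the drive letter is interpreted as a handle, that is, a reference to a previously declared input file.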
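As a rough illustration of the corrected call, the sketch below builds the pdftk argument list in one place so that the output keyword always sits between the page range and the destination file. The helper names Get-FirstMatchingPage and Get-PdftkSplitArgs, the sample page texts, and the file names in.pdf and out.pdf are all hypothetical and not part of the original post. It also assumes you want to skip a file when the keyword is not found, a case the loop in the question does not handle.

```powershell
# Hypothetical helper: returns the 1-based number of the first page whose text
# matches the keyword, or 0 when no page matches.
function Get-FirstMatchingPage {
    param(
        [string[]]$PageTexts,
        [string]$Keyword
    )
    for ($i = 0; $i -lt $PageTexts.Count; $i++) {
        # -match treats $Keyword as a regular expression and ignores case
        if ($PageTexts[$i] -match $Keyword) {
            return $i + 1
        }
    }
    return 0
}

# Hypothetical helper: builds "<input> cat <n>-end output <file>" as an array.
function Get-PdftkSplitArgs {
    param(
        [string]$InputPath,
        [int]$FirstPage,
        [string]$OutputPath
    )
    # "${FirstPage}-end" keeps the page range together as a single argument
    return @($InputPath, 'cat', "${FirstPage}-end", 'output', $OutputPath)
}

# Sample data standing in for the text extracted from each page (assumption)
$pages = @('cover', 'index', 'TEST results', 'appendix')
$first = Get-FirstMatchingPage -PageTexts $pages -Keyword 'TEST'

if ($first -gt 0) {
    $pdftkArgs = Get-PdftkSplitArgs -InputPath 'in.pdf' -FirstPage $first -OutputPath 'out.pdf'
    Write-Host ($pdftkArgs -join ' ')
} else {
    Write-Host 'keyword not found, nothing to split'
}
```

With pdftk installed, the array could then be passed to the executable by splatting, for example & $pdftk @pdftkArgs, where $pdftk is the path defined in the question.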