Recursively changing file extensions from .docx and .pdf to .txt

$findPDF = Get-ChildItem -Path "$fileDrive" -Filter *.pdf -r 
$findDOCX = Get-ChildItem -Path "$fileDrive" -Filter *.docx -r

$pullFiles += $findPDF
$pullFiles += $findDOCX
#[array]$pullFiles 

#$pullFiles.length

$holdPath = @()
for($i = 0; $i -lt $pullFiles.length; $i++){
        #get the full path of each document
        $fullPath = Resolve-Path $pullFiles.fullname[$i]
        #stores the information in a global array
        $holdPath += $fullPath.path
}
#$holdPath

<#
.DESCRIPTION Uses the word.APPLICATION object to open and convert the word documents into .txt.
#>

#
#wdFormatDOSTextLineBreaks  5   Microsoft DOS text with line breaks preserved.


foreach($fi in $holdPath){
    $Doc = $word.Documents.Open($fi.name)

    $NameDOCX = ($Doc.name).replace("docx","txt")
    $Doc.saveas([ref] $NameDOCX, [ref] 5)

    $NamePDF = ($Doc.name).replace("pdf","txt")
    $Doc.saveas([ref] $NamePDF, [ref] 5)

    $Doc.close()
}

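Problem statement: the program needs to take any pdf and doc/x files and convert them to .txt files. Right now I can recursively search the file system and pull every .docx and .pdf document. All that is left is converting them.

Errors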

You cannot call a method on a null-valued expression.
At C:\Users\p617824\Documents\files\powershell\fileExtRename.ps1:38 char:2
+     $Doc = $word.Documents.Open($fi.name)
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvokeMethodOnNull

You cannot call a method on a null-valued expression.
At C:\Users\p617824\Documents\files\powershell\fileExtRename.ps1:40 char:2
+     $NameDOCX = ($Doc.name).replace("docx","txt")
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvokeMethodOnNull

[ref] cannot be applied to a variable that does not exist.
At C:\Users\p617824\Documents\files\powershell\fileExtRename.ps1:41 char:2
+     $Doc.saveas([ref] $NameDOCX, [ref] 5)
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (NameDOCX:VariablePath) [], RuntimeException
    + FullyQualifiedErrorId : NonExistingVariableReference

You cannot call a method on a null-valued expression.
At C:\Users\p617824\Documents\files\powershell\fileExtRename.ps1:43 char:2
+     $NamePDF = ($Doc.name).replace("pdf","txt")
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvokeMethodOnNull

[ref] cannot be applied to a variable that does not exist.
At C:\Users\p617824\Documents\files\powershell\fileExtRename.ps1:44 char:2
+     $Doc.saveas([ref] $NamePDF, [ref] 5)
+     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (NamePDF:VariablePath) [], RuntimeException
    + FullyQualifiedErrorId : NonExistingVariableReference

You cannot call a method on a null-valued expression.
At C:\Users\p617824\Documents\files\powershell\fileExtRename.ps1:46 char:2
+     $Doc.close()
+     ~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (:) [], RuntimeException
    + FullyQualifiedErrorId : InvokeMethodOnNull

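The $word variable is never initialized, so $word.Documents.Open() is called on $null. That leaves $Doc null as well, which is why every later line in the loop fails too. Your script is also more complicated than it needs to be (no offense intended). Rewrite the whole script like this: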

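$fileDrive = "C:\temp"
# Start Word through COM; this is the object the original script never created.
$word = New-Object -ComObject Word.Application

Get-ChildItem -Path $fileDrive -File -Recurse -Include "*.docx", "*.pdf" | ForEach-Object {

    $Doc = $word.Documents.Open($_.FullName)

    # Swap only the extension, so ".docx" or ".pdf" elsewhere in the path is left alone.
    $NameDOC = [System.IO.Path]::ChangeExtension($_.FullName, ".txt")

    # 5 = wdFormatDOSTextLineBreaks
    $Doc.SaveAs([ref] $NameDOC, [ref] 5)
    $Doc.Close()

}

$word.Quit()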
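
One thing the script above does not cover: if Open or SaveAs throws for a single file, the pipeline may stop before Quit() runs and Word stays open in the background. Below is a minimal sketch of the same loop with cleanup in finally blocks. The function name Convert-WordReadableToText and its -Root parameter are made up for illustration, and it assumes Word is installed and PowerShell 3.0 or later.

```powershell
# Hypothetical helper, not part of the original answer.
function Convert-WordReadableToText {
    param(
        [Parameter(Mandatory)] [string] $Root   # folder to search, e.g. C:\temp
    )

    $word = New-Object -ComObject Word.Application
    $word.Visible = $false
    try {
        Get-ChildItem -Path $Root -File -Recurse -Include '*.docx', '*.pdf' | ForEach-Object {
            $file   = $_   # keep the file; inside catch, $_ is the error record
            $target = [System.IO.Path]::ChangeExtension($file.FullName, '.txt')
            $doc    = $null
            try {
                $doc = $word.Documents.Open($file.FullName)
                $doc.SaveAs([ref] $target, [ref] 5)   # 5 = wdFormatDOSTextLineBreaks
            }
            catch {
                # Report the failure and carry on with the next file.
                Write-Warning "Skipped $($file.FullName): $($_.Exception.Message)"
            }
            finally {
                if ($null -ne $doc) { $doc.Close() }
            }
        }
    }
    finally {
        # Runs even if something above throws, so Word is always shut down.
        $word.Quit()
        [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($word)
    }
}
```

It would be called as Convert-WordReadableToText -Root "C:\temp". A file that cannot be opened produces a warning instead of ending the run.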

However, I have doubts about using the Word application this way to convert a pdf to .txt... I think you should use the iTextSharp library for that, as shown here.
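
For the PDF half, an iTextSharp-based version could look roughly like the sketch below. It is not taken from the linked example. It assumes an iTextSharp 5.x assembly is available locally, and the function name Convert-PdfToText, the -ITextSharpDll parameter and any path passed to it are hypothetical.

```powershell
# Hypothetical helper; assumes an iTextSharp 5.x DLL on disk.
function Convert-PdfToText {
    param(
        [Parameter(Mandatory)] [string] $PdfPath,        # full path of the .pdf
        [Parameter(Mandatory)] [string] $ITextSharpDll   # full path of itextsharp.dll
    )

    # Load the library; loading the same assembly again is harmless.
    Add-Type -Path $ITextSharpDll

    $reader = New-Object iTextSharp.text.pdf.PdfReader -ArgumentList $PdfPath
    try {
        $text = New-Object System.Text.StringBuilder
        # Page numbers are 1-based in iTextSharp.
        for ($page = 1; $page -le $reader.NumberOfPages; $page++) {
            $pageText = [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($reader, $page)
            [void]$text.AppendLine($pageText)
        }
        # Write the text next to the PDF, with the extension swapped to .txt.
        $target = [System.IO.Path]::ChangeExtension($PdfPath, '.txt')
        [System.IO.File]::WriteAllText($target, $text.ToString())
    }
    finally {
        $reader.Close()
    }
}
```

Combined with the search from the script above, that would be something like Get-ChildItem -Path $fileDrive -File -Recurse -Filter "*.pdf" | ForEach-Object { Convert-PdfToText -PdfPath $_.FullName -ITextSharpDll $dll }, with $dll pointing at wherever the library was saved. Scanned PDFs that contain only images would come out empty, since this reads the text layer and does no OCR.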