I cannot extract hyperlinks from a Word doc with PowerShell
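I am trying to recursively search a directory structure for Word documents and then extract their hyperlinks. When the code runs, the output is as follows: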
processing 2 docs
File Name Hyperlink
--------- ---------
C:\temp\doc1.docx
C:\temp\doc1.docx
C:\temp\folder\doc2.docx
C:\temp\folder\doc2.docx
Nothing I have tried seems to work. I have tried using:
- "Hyperlink" = $_Address
- "Hyperlink" = $thisDoc.Address
- "Hyperlink" = $thisDoc.Hyperlink.Address
Clear-Host
$parentFolder = "C:\temp"
$ourDocs = Get-ChildItem -Recurse -LiteralPath $parentFolder -file -include *.doc*
"processing {0} docs" -f $ourDocs.Count
$word = New-Object -ComObject word.application
$word.Visible = $false
$word.ScreenUpdating = $false
$array = New-Object System.Collections.ArrayList
$ourDocs | ForEach-Object{
    $thisDoc = $word.Documents.Open($_.FullName)
    $thisDoc.Hyperlinks | ForEach-Object {
        $array.Add([pscustomobject]@{
            "File Name" = $thisDoc.FullName
            "Hyperlink" = $_Address}) | Out-null
    }
    $thisDoc.Close()
}
$Word.Quit()
$array
# cleanup com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($word) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
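The error is in how you are referencing the property whose value you want. $_Address is parsed as a variable named _Address, which was never assigned, so the Hyperlink column comes out empty. The property of the current pipeline item is $_.Address (or $PSItem.Address).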
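To make the difference concrete without needing Word installed, here is a minimal sketch that uses stand-in objects. The $links data and the example.com URLs are made up for illustration; only the $_Address versus $_.Address contrast is the point.

```powershell
# Stand-in objects with an Address property (hypothetical data, no Word required)
$links = @(
    [pscustomobject]@{ Address = 'https://example.com/a' }
    [pscustomobject]@{ Address = 'https://example.com/b' }
)

# Buggy: $_Address is a variable named "_Address"; it was never set, so it is $null
$buggy = $links | ForEach-Object { [pscustomobject]@{ Hyperlink = $_Address } }

# Fixed: $_.Address reads the Address property of the current pipeline item
$fixed = $links | ForEach-Object { [pscustomobject]@{ Hyperlink = $_.Address } }

$buggy.Hyperlink   # empty values
$fixed.Hyperlink   # the two URLs
```

With Set-StrictMode enabled, the buggy line would raise an error for the unset variable instead of silently producing empty values, which can help catch this kind of typo early.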
Try this refactored version:
Clear-Host
$parentFolder = "D:\temp\Word"
$ourDocs = Get-ChildItem -Recurse -LiteralPath $parentFolder -file -include '*.doc*'
"processing {0} docs" -f $ourDocs.Count
$word = New-Object -ComObject word.application
$word.Visible = $false
$word.ScreenUpdating = $false
# This really is not needed for your posted use case.
# $array = New-Object System.Collections.ArrayList
$ourDocs |
ForEach-Object{
    $thisDoc = $word.Documents.Open($PSItem.FullName)
    @($thisDoc.Hyperlinks) |
    ForEach-Object {
        [pscustomobject]@{
            FileName  = $thisDoc.FullName
            HyperLink = $PSitem.Address
        }
    }
    $thisDoc.Close()
}
$Word.Quit()
# cleanup com objects
[System.Runtime.Interopservices.Marshal]::ReleaseComObject($word) | Out-Null
[System.GC]::Collect()
[System.GC]::WaitForPendingFinalizers()
# Results
<#
processing 4 docs
FileName                     HyperLink
--------                     ---------
D:\temp\Word\WES - Copy.docx http://stackoverfow.com/
D:\temp\Word\WES - Copy.docx https://superuser.com/questions/tagged/powershell
#>
Update, per your CSV comment and my reply to it:
...
$ourDocs |
ForEach-Object{
    $thisDoc = $word.Documents.Open($PSItem.FullName)
    @($thisDoc.Hyperlinks) |
    ForEach-Object {
        [pscustomobject]@{
            FileName  = $thisDoc.FullName
            HyperLink = $PSitem.Address
        }
    } |
    Export-Csv -Path 'D:\Temp\WordHyperLinkReport.csv' -Append -NoTypeInformation
    $thisDoc.Close()
}
...
Import-Csv -Path 'D:\Temp\WordHyperLinkReport.csv'
# Results
<#
FileName                     HyperLink
--------                     ---------
D:\temp\Word\WES - Copy.docx http://stackoverfow.com/
D:\temp\Word\WES - Copy.docx https://superuser.com/questions/tagged/powershell
#>
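One thing to watch with -Append: rerunning the script adds to whatever report already exists. A small sketch of the append-and-read-back pattern follows, using made-up rows and a temporary file; the $rows data and the demo file name are assumptions for illustration, not part of the script above.

```powershell
# Hypothetical rows standing in for the per-document hyperlink objects
$rows = @(
    [pscustomobject]@{ FileName = 'C:\temp\doc1.docx'; HyperLink = 'https://example.com/a' }
    [pscustomobject]@{ FileName = 'C:\temp\doc1.docx'; HyperLink = 'https://example.com/b' }
)

# Temporary report path (an assumption; substitute your own)
$report = Join-Path ([System.IO.Path]::GetTempPath()) 'WordHyperLinkReport-demo.csv'

# Remove any earlier report first, otherwise -Append keeps adding to it
Remove-Item -LiteralPath $report -ErrorAction SilentlyContinue

# Each call appends; the header row is written only when the file is first created
$rows[0] | Export-Csv -Path $report -Append -NoTypeInformation
$rows[1] | Export-Csv -Path $report -Append -NoTypeInformation

# Read the report back as objects
$imported = Import-Csv -Path $report
$imported.Count   # 2
```

If a fresh report is wanted on every run, deleting the old file up front (as above) or collecting all objects and calling Export-Csv once without -Append are both reasonable options.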