How to replace all images in an html file with their base64-codes? (Powershell)

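I use a program called Belarc Advisor, which produces an HTML report of all hardware and software details, including the licenses/keys/serials of installed software. I usually create this report on a new PC, or before formatting a PC. However, the file exported from Chrome uses a separate folder for the images, and I need a standalone HTML file that contains all the details and images (including the CSS styles of the HTML report).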

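Currently I have to replace the images by hand in Notepad++ with base64 codes generated on an online site. I am looking for a way to do this in a batch script or PowerShell instead. I found two Stack Overflow questions, {q1} and {q2}, and a {blog-post}, and came up with the following code: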

    $original_file = 'path\filename.html'
    $destination_file =  'path\filename.new.html'
    (Get-Content $original_file) | Foreach-IMG-SELECTOR-Object {
        $path = $_ SOURCE-TAG-SELECTOR `
        -replace $path, [convert]::ToBase64String((get-content $path -encoding byte))
    } | Set-Content $destination_file

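Inside Foreach-Object, could the objects perhaps be selected by the HTML img tag? If so, the base64 conversion would be easy!

To convert to base64, the expression is: [convert]::ToBase64String((get-content $path -encoding byte))

where $path is the path to the image, which can be copied from the <img src=""> tag.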
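A minimal sketch of that conversion, assuming the image is read as raw bytes. `Get-Content -Encoding Byte` works in Windows PowerShell 5.x but was replaced by `-AsByteStream` in PowerShell 6 and later, so the sketch uses `[System.IO.File]::ReadAllBytes`, which is available in both. The function name `ConvertTo-Base64File` and the temporary demo file are hypothetical, not part of the original question.

```powershell
# Hypothetical helper: return the Base64 text for a file's raw bytes.
function ConvertTo-Base64File {
    param([Parameter(Mandatory)][string]$Path)
    # Resolve to a full path first: .NET methods do not follow PowerShell's current location.
    $sFullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    [System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes($sFullPath))
}

# Demo on a throwaway file holding the three ASCII bytes of "Man" (not a real image).
$sDemoFile = Join-Path ([System.IO.Path]::GetTempPath()) 'base64-demo.bin'
[System.IO.File]::WriteAllBytes($sDemoFile, [byte[]](77, 97, 110))
$sDemoBase64 = ConvertTo-Base64File -Path $sDemoFile
```

For a real image, pass the path taken from the src attribute instead of the demo file.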

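I just read that Windows 10 ships with PowerShell 5.0, so I thought I could create a batch file to do this.

So if the img tags and their src attributes can be selected, the src values only need to be replaced with their base64 codes.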
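Without an HTML parser, selecting the img tags and their src attributes can be approximated with a regular expression. The sketch below is one way to do that, not a robust HTML parser: it assumes quoted src values, a single image type for the whole report, and an absolute `$BaseDir`. The function name `Convert-ImgSrcToDataUri` and its parameters are hypothetical.

```powershell
# Hypothetical helper: inline every <img src="..."> that points at a local file.
function Convert-ImgSrcToDataUri {
    param(
        [Parameter(Mandatory)][string]$Html,
        [Parameter(Mandatory)][string]$BaseDir,   # assumption: absolute folder of the HTML file
        [string]$MimeType = 'image/gif'           # assumption: one image type per report
    )
    # Group 1 = tag text up to the quote, group 2 = the quote, group 3 = the src value.
    $sPattern = '(<img\b[^>]*?\ssrc\s*=\s*)(["''])(.*?)\2'
    $fnReplace = {
        param($oMatch)
        $sSrc = $oMatch.Groups[3].Value
        # Leave data:, http: and other absolute URIs untouched.
        if ($sSrc -match '^[a-z][a-z0-9+.\-]*:') { return $oMatch.Value }
        $sFile = Join-Path -Path $BaseDir -ChildPath $sSrc
        # Leave the tag untouched when the file cannot be found.
        if (-not (Test-Path -LiteralPath $sFile -PathType Leaf)) { return $oMatch.Value }
        $sBase64 = [System.Convert]::ToBase64String([System.IO.File]::ReadAllBytes($sFile))
        $sQuote = $oMatch.Groups[2].Value
        $oMatch.Groups[1].Value + $sQuote + "data:$MimeType;base64,$sBase64" + $sQuote
    }
    $oOptions = [System.Text.RegularExpressions.RegexOptions]::IgnoreCase
    [regex]::Replace($Html, $sPattern, $fnReplace, $oOptions)
}
```

A possible call would read the report with `Get-Content -Raw`, pass the text as `-Html`, and write the result with `Set-Content`. Regular expressions can miss unusual markup (unquoted attributes, `>` inside attribute values), which is where a parser such as HtmlAgilityPack is the safer choice.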

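Modified version of the answer

The answer provided by Alexander did not work because, during the loop, the attribute value was being set on #Document when it should have been set on the current node. After searching online and experimenting in the PowerShell console, I found that this can be solved by selecting the current node via XPath. Here is the modified answer: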

    Import-Module -Name "C:\HtmlAgilityPack.1.4.6\Net40\HtmlAgilityPack.dll" # Change to your actual path
    Add-Type -AssemblyName System.Drawing # Load System.Drawing so [System.Drawing.Image] resolves

    # Re-encodes the image file in the chosen format and returns it as a Base64 string.
    function Convert_to_Base64 ($sImgFile)
    {
        #$sImgFile = "C:\image.jpg" # Change to your actual path
        $oImgFormat = [System.Drawing.Imaging.ImageFormat]::Gif # Change to your format

        $oImage = [System.Drawing.Image]::FromFile($sImgFile)
        $oMemoryStream = New-Object -TypeName System.IO.MemoryStream
        $oImage.Save($oMemoryStream, $oImgFormat)
        $oImage.Dispose() # Release the lock that FromFile holds on the image file
        $cImgBytes = [Byte[]]($oMemoryStream.ToArray())
        $sBase64 = [System.Convert]::ToBase64String($cImgBytes)

        $sBase64
    }


    $sInFile = "C:\Users\USER\Desktop\BelarcAdvisor win10\Belarc Advisor Computer Profile.html" # Change to your actual path
    $sOutFile = "D:\Win10-Belarc.html" # Change to your actual path
    $sPathBase = "C:\Users\USER\Desktop\BelarcAdvisor win10\"

    $sXpath = "//img"
    $sAttributeName = "src"

    $oHtmlDocument = New-Object -TypeName HtmlAgilityPack.HtmlDocument
    $oHtmlDocument.Load($sInFile)
    $oHtmlDocument.DocumentNode.SelectNodes($sXpath) | ForEach-Object {
        # Keep the current node in a variable so that its attributes and its
        # XPath can be read from it later in the loop body.
        $sVarXPath = $_

        #$sVarXPath.XPath

        # Extract the image path from the src attribute
        # (note that it may be relative, not absolute):
        $sSrcPath = $sVarXPath.get_Attributes() `
            | Where-Object { $_.Name -eq $sAttributeName } `
            | Select-Object -ExpandProperty "Value"
        # Assemble the absolute path. Substring(2) strips the leading "./"
        # from the src string of an img in a subfolder.
        $sUri = Join-Path -Path $sPathBase -ChildPath ($sSrcPath.Substring(2))
        #$sUri
        #[System.Drawing.Image]::FromFile($sUri)

        # Base64 conversion of the image file.
        $sBase64 = Convert_to_Base64 $sUri

        # Convert_to_Base64 re-encodes the image as GIF, so the MIME type says gif.
        $sSrcValue = "data:image/gif;base64," + $sBase64
        # Select the current node again through its XPath and set the attribute on it.
        $oHtmlDocument.DocumentNode.SelectNodes($sVarXPath.XPath).SetAttributeValue($sAttributeName, $sSrcValue)
        #$oHtmlDocument.DocumentNode.SelectNodes($sVarXPath.XPath).GetAttributeValue($sAttributeName, "")
    }

    #$oHtmlDocument.DocumentNode.SelectNodes($sXpath) | foreach-object { write-output $_ }

    $oHtmlDocument.Save($sOutFile)
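The `Substring(2)` call above assumes that every src value starts with `./`. A slightly more defensive sketch is shown here, under the assumption that src values are relative paths that may or may not carry the `./` prefix and may contain URL escapes such as `%20`. The function name `Resolve-ImgSrcPath` is hypothetical.

```powershell
# Hypothetical helper: map an <img> src value to a local file path under $BaseDir.
function Resolve-ImgSrcPath {
    param(
        [Parameter(Mandatory)][string]$BaseDir,
        [Parameter(Mandatory)][string]$Src
    )
    # Undo URL escaping such as %20, in case the saved page contains it.
    $sRelative = [System.Uri]::UnescapeDataString($Src)
    # Drop a leading "./" only when it is actually there.
    if ($sRelative.StartsWith('./', [System.StringComparison]::Ordinal)) {
        $sRelative = $sRelative.Substring(2)
    }
    # Use the platform's directory separator for the remaining segments.
    $sRelative = $sRelative.Replace('/', [string][System.IO.Path]::DirectorySeparatorChar)
    Join-Path -Path $BaseDir -ChildPath $sRelative
}
```

In the loop above, `Resolve-ImgSrcPath -BaseDir $sPathBase -Src $sSrcPath` could then stand in for the `Join-Path` line.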

This is easy. You can use HtmlAgilityPack to parse the HTML:

    Import-Module -Name "C:\HtmlAgilityPack.dll" # Change to your actual path

    $sInFile = "E:\Temp\test.html" # Change to your actual path
    $sOutFile = "E:\temp\test1.html" # Change to your actual path
    $sUriBase = "http://example.com/" # Change to your actual URI base

    $sXpath = "//img"
    $sAttributeName = "src"

    $oHtmlDocument = New-Object -TypeName HtmlAgilityPack.HtmlDocument
    $oHtmlDocument.Load($sInFile)
    $oHtmlDocument.DocumentNode.SelectNodes($sXpath) | ForEach-Object {
        # If you need to download the image, here's how you can extract the image
        # URI (note that it may be relative, not absolute):
        $sSrcPath = $_.get_Attributes() `
            | Where-Object { $_.Name -eq $sAttributeName } `
            | Select-Object -ExpandProperty "Value"
        # Assembling absolute URI:
        $sUri = $sUriBase + $sSrcPath
        # Now you can d/l the image: Invoke-WebRequest -Uri $sUri


        # Put your Base64 conversion code here.
        $sBase64 = ...

        $sSrcValue = "data:image/png;base64," + $sBase64
        $_.SetAttributeValue($sAttributeName, $sSrcValue)
    }

    $oHtmlDocument.Save($sOutFile)
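The snippet above hardcodes `image/png` in the data URI. If the report mixes image types, the prefix could instead be derived from the file extension. The following is only a sketch: the lookup table `$hMimeTypes`, its handful of entries, and the function name `Get-DataUriPrefix` are assumptions for illustration.

```powershell
# Hypothetical lookup table: file extension -> MIME type for the data URI prefix.
$hMimeTypes = @{
    '.png'  = 'image/png'
    '.gif'  = 'image/gif'
    '.jpg'  = 'image/jpeg'
    '.jpeg' = 'image/jpeg'
    '.svg'  = 'image/svg+xml'
}

# Hypothetical helper: build the "data:<mime>;base64," prefix for an image path.
function Get-DataUriPrefix {
    param([Parameter(Mandatory)][string]$Path)
    $sExt = [System.IO.Path]::GetExtension($Path).ToLowerInvariant()
    if (-not $hMimeTypes.ContainsKey($sExt)) {
        throw "No MIME type known for extension '$sExt'"
    }
    'data:' + $hMimeTypes[$sExt] + ';base64,'
}
```

With this, `$sSrcValue` could be built as `(Get-DataUriPrefix $sSrcPath) + $sBase64`, provided the Base64 string holds the file's original bytes rather than a re-encoded copy.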

Converting image file to Base64 string:

    Add-Type -AssemblyName System.Drawing # Load System.Drawing so [System.Drawing.Image] resolves

    $sImgFile = "C:\image.jpg" # Change to your actual path
    $oImgFormat = [System.Drawing.Imaging.ImageFormat]::Jpeg # Change to your format

    $oImage = [System.Drawing.Image]::FromFile($sImgFile)
    $oMemoryStream = New-Object -TypeName System.IO.MemoryStream
    $oImage.Save($oMemoryStream, $oImgFormat)
    $oImage.Dispose() # Release the lock that FromFile holds on the image file
    $cImgBytes = [Byte[]]($oMemoryStream.ToArray())
    $sBase64 = [System.Convert]::ToBase64String($cImgBytes)

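This is only a partial answer, but I was able to do the conversion from a single image file to an output text file with data URI encoding in one line:

    "data:image/png;base64," + [convert]::tobase64string([io.file]::readallbytes(($pwd).path + "\image.png")) | set-content -encoding ascii "image.txt"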

(Note that the output file encoding seems to make a difference.)
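The encoding matters because the defaults differ between tools and versions: in Windows PowerShell 5.x, `>` and `Out-File` write UTF-16LE by default, while PowerShell 6 and later default to UTF-8 without a byte-order mark. A data URI consists only of ASCII characters, so `-Encoding ascii` gives one byte per character. The sketch below checks that by reading the file back as bytes. The file name and the three placeholder bytes (not a real PNG) are assumptions for the demo.

```powershell
# Placeholder payload: the three ASCII bytes of "Man" stand in for real image bytes.
$sDataUri = 'data:image/png;base64,' + [System.Convert]::ToBase64String([byte[]](77, 97, 110))
$sOutFile = Join-Path ([System.IO.Path]::GetTempPath()) 'data-uri-demo.txt'

# -NoNewline (PowerShell 5.0+) keeps the file to exactly the URI text.
$sDataUri | Set-Content -LiteralPath $sOutFile -Encoding ascii -NoNewline

# Read the raw bytes back: ASCII output has no byte-order mark and
# uses exactly one byte per character.
$cBytes = [System.IO.File]::ReadAllBytes($sOutFile)
$bSameLength = ($cBytes.Length -eq $sDataUri.Length)
$sRoundTrip = [System.Text.Encoding]::ASCII.GetString($cBytes)
```

If `$bSameLength` is false for a file written another way, the file was most likely written in a multi-byte encoding or with a byte-order mark or trailing newline added.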

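I am posting this mainly because this question is what came up in my web search, and because it also simplifies the conversion in Alexander's answer.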