How to truncate the end of a binary file past a known address using PowerShell?

Question

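Apologies in advance for the lengthy post, but I'm trying to include the scripts I've used and tested so far. I'm also new to working with binary files and PowerShell, and I'm pulling my hair out here. I have a file from which I must delete the data from a known address to the end of the file. I've referenced multiple articles on S.O., and the one that seems closest to what I'm trying to accomplish links to an article that I had also found.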

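I feel like I'm really close, but I'm not sure I'm using the function correctly, because I'm having trouble working out the hex equivalent of the regex ".*" (match zero or more of anything) that would remove the remaining data from the known address to the end of the file. Maybe I'm overthinking this?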

My known address will always be 005A08B0, and nothing after it has a repeatable pattern, so I can't simply search for \xF0\x00\x01 or a similar pattern.

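This part of the script is unchanged. I'm assuming the function can stay the same, and at a loose level I understand what it's doing: streaming the specified file and going to the end of the file to find the number of matching regex patterns: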

function ConvertTo-BinaryString {
    # converts the bytes of a file to a string that has a
    # 1-to-1 mapping back to the file's original bytes. 
    # Useful for performing binary regular expressions.
    [OutputType([String])]
    Param (
        [Parameter(Mandatory = $True, ValueFromPipeline = $True, Position = 0)]
        [ValidateScript( { Test-Path $_ -PathType Leaf } )]
        [String]$Path
    )

    $Stream = New-Object System.IO.FileStream -ArgumentList $Path, 'Open', 'Read'

    # Note: Codepage 28591 returns a 1-to-1 char to byte mapping
    $Encoding     = [Text.Encoding]::GetEncoding(28591)
    $StreamReader = New-Object System.IO.StreamReader -ArgumentList $Stream, $Encoding
    $BinaryText   = $StreamReader.ReadToEnd()

    $StreamReader.Close()
    $Stream.Close()

    return $BinaryText
}
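As a side note on the "1-to-1 mapping" comment in the function above, the following minimal sketch checks that code page 28591 (ISO-8859-1) round-trips every possible byte value. The variable names are illustrative only and are not part of the original script.

```powershell
# Assumption: code page 28591 (ISO-8859-1) is available, as it is in the function above.
$enc = [Text.Encoding]::GetEncoding(28591)

# Every possible byte value, 0x00 through 0xFF.
$allBytes = [byte[]](0..255)

# Decode to a string, then encode back to bytes.
$text      = $enc.GetString($allBytes)
$roundTrip = $enc.GetBytes($text)

# One character per byte, and the bytes survive the round trip unchanged.
$sameLength  = ($text.Length -eq $allBytes.Length)
$sameContent = (($roundTrip -join ',') -eq ($allBytes -join ','))
"Same length: $sameLength; same content: $sameContent"
```

Because each character corresponds to exactly one byte, a character index in the string is also a byte offset in the file.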

This part, for my input files, is pretty straightforward:

$inputFile  = 'C:\StartFile.dat'
$outputFile = 'C:\EndFile_test.dat'
$fileBytes  = [System.IO.File]::ReadAllBytes($inputFile)
$binString  = ConvertTo-BinaryString -Path $inputFile

This is where things fall apart, and I think it's the only part I'll really have to modify:

# This is the portion I am having a problem with - what do I need to do for this regex???
$re = [Regex]'[\x5A08B0]{30}*'

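This portion doesn't seem to need much modification, since the position naturally moves through the file and offsets itself after each match is found?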

# use a MemoryStream object to store the result
$ms  = New-Object System.IO.MemoryStream
$pos = $replacements = 0

$re.Matches($binString) | ForEach-Object {
    # write the part of the byte array before the match to the MemoryStream
    $ms.Write($fileBytes, $pos, $_.Index)
    # update the 'cursor' position for the next match
    $pos += ($_.Index + $_.Length)
    # and count the number of replacements done
    $replacements++
}

# write the remainder of the bytes to the stream
$ms.Write($fileBytes, $pos, $fileBytes.Count - $pos)

# save the updated bytes to a new file (will overwrite existing file)
[System.IO.File]::WriteAllBytes($outputFile, $ms.ToArray())
$ms.Dispose()

if ($replacements) {
    Write-Host "$replacements replacement(s) made."
}
else {
    Write-Host "Byte sequence not found. No replacements made."
}

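In addition, I tried the following to at least see whether I could confirm that the appropriate address was being referenced in a known file, which seems like a different way to start: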

#Decimal Equivalent of the Hex Address:
$offset = 5900464

$bytes = [System.IO.File]::ReadAllBytes("C:\TestFile.dat");
Echo $bytes[$offset]

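When I run the small script above, I at least get the correct character for the known file: it outputs the decimal equivalent of the ASCII character at that position in the file.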
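For reference, the hex address and the decimal offset used above can be cross-checked directly in PowerShell, which parses 0x-prefixed literals as integers. This is only a small sketch; the variable names are illustrative.

```powershell
# PowerShell understands hex literals, so no manual conversion is needed.
$offset = 0x5A08B0

# Decimal value of the hex address.
$asDecimal = [string]$offset

# Format the number back as an 8-digit, zero-padded hex string.
$asHex = '{0:X8}' -f $offset

# Parse a hex string (without the 0x prefix) into an integer.
$parsed = [Convert]::ToInt32('005A08B0', 16)

"Decimal: $asDecimal; hex: $asHex; parsed: $parsed"
```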

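I can do this manually with a hex editor, but it has to be done through a script. I appreciate all the help I can get. Some disclosures: it has to be done with programs native to Windows 7/Windows 10. No separate executables can be downloaded, and SysInternals is out as well. I was originally looking for batch-file ideas, but I can easily port PowerShell commands into a batch file.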

Answer 1

To simply truncate a file, i.e. to remove any content beyond a given byte offset, you can use System.IO.File's static OpenWrite() method to obtain a System.IO.FileStream instance and call its .SetLength() method:

$inputFile  = 'C:\StartFile.dat'
$outputFile = 'C:\EndFile_test.dat'

# First, copy the input file to the output file.
Copy-Item -LiteralPath $inputFile -Destination $outputFile

# Open the output file for writing.
$fs = [System.IO.File]::OpenWrite($outputFile)

# Set the file length based on the desired byte offset
# in order to truncate it (assuming it is larger).
$fs.SetLength(0x5A08B0)

$fs.Close()
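A length of 0x5A08B0 keeps the bytes at offsets 0 through 0x5A08AF, so the byte at address 0x5A08B0 is the first one removed. If that byte should be kept, the length would need to be 0x5A08B1 instead. The following is a minimal sketch of the same technique wrapped in a helper that only ever shrinks a file and closes the stream even if an error occurs. The function name Limit-FileLength and the throwaway demo file are assumptions made for illustration, not part of the original answer.

```powershell
function Limit-FileLength {
    # Hypothetical helper: truncates a file in place, but never extends it.
    Param (
        [Parameter(Mandatory = $True)] [String]$Path,
        [Parameter(Mandatory = $True)] [Int64]$Length
    )

    # Note: .NET resolves relative paths against the process working
    # directory, so passing a full path is the safer choice.
    $fs = [System.IO.File]::OpenWrite($Path)
    try {
        # Only shrink; leave shorter files untouched.
        if ($fs.Length -gt $Length) {
            $fs.SetLength($Length)
        }
    }
    finally {
        # Release the file handle even if SetLength() throws.
        $fs.Close()
    }
}

# Demo on a throwaway 16-byte file containing the values 0..15.
$demo = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllBytes($demo, [byte[]](0..15))

# Keep offsets 0x00 through 0x09; offset 0x0A onward is removed.
Limit-FileLength -Path $demo -Length 0x0A

$result = [System.IO.File]::ReadAllBytes($demo)
"Remaining bytes: $($result.Length); last byte value: $($result[-1])"

Remove-Item -LiteralPath $demo
```

For the file in the question, the equivalent call would be Limit-FileLength -Path $outputFile -Length 0x5A08B0, run after the Copy-Item step.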

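Note: If the given offset amounts to increasing the file's size, the additional space appears to be filled with NUL (0x0) bytes, as quick tests on macOS and Windows suggest; however, judging by the .SetLength() documentation, this behavior does not seem to be guaranteed: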

If the stream is expanded, the contents of the stream between the old and the new length are undefined.
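If a file ever does need to be extended to a given length, one way to avoid relying on that undefined behavior is to append the zero bytes explicitly. The following is only a sketch under that assumption; the temporary demo file and the variable names are illustrative.

```powershell
# Demo on a throwaway 3-byte file.
$demo = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllBytes($demo, [byte[]](1, 2, 3))

$targetLength = 8

$fs = [System.IO.File]::Open($demo, 'Open', 'ReadWrite')
try {
    if ($fs.Length -lt $targetLength) {
        # A newly created byte array is zero-initialized.
        $padding = New-Object byte[] ([int]($targetLength - $fs.Length))

        # Move to the end of the file and append the zeros explicitly.
        [void]$fs.Seek(0, 'End')
        $fs.Write($padding, 0, $padding.Length)
    }
}
finally {
    $fs.Close()
}

$padded = [System.IO.File]::ReadAllBytes($demo)
"Length: $($padded.Length); bytes: $($padded -join ' ')"

Remove-Item -LiteralPath $demo
```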

binary

powershell