Remove region within HTML

I want to remove a self-imposed region within HTML. I have worked out the correct regular expression and verified it in Expresso, where it highlights the correct section.

I am running it in multiline mode, and I have added that setting to the PowerShell regex string:

Get-ChildItem (Get-Item -Path ".\" -Verbose).FullName -Recurse -Filter *.html |
Foreach-Object {
    Write-Host "Checking "$_.FullName
    $content = Get-Content $_.FullName
    $content = $content -replace "(?m)(^.*\#region REMOVE.*)[.|\n|\W|\w]*(^.*\#endregion REMOVE.*)",""
    Set-Content $content -Path $_.FullName
}

Unfortunately, although the files are modified, the region is not removed.

From the Get-Content documentation:

The Get-Content cmdlet gets the content of the item at the location specified by the path, such as the text in a file. It reads the content one line at a time and returns a collection of objects, each of which represents a line of content.

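So your regex is being applied to an array of lines, one element at a time, rather than to a single string holding the whole file. A pattern that has to span several lines can therefore never match. Change the read to: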

$content = [System.IO.File]::ReadAllText($_.FullName);

and it will work.
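
To make the difference concrete, here is a minimal sketch of the whole fix. It rests on a few assumptions that are not in the original post: `Remove-MarkedRegion` is a hypothetical helper name, the sample markup is invented for illustration, and the pattern is a simplified, lazy variant of the one in the question (so that each region is removed separately), not the asker's exact regex.

```powershell
# Hypothetical helper (name is an assumption): removes every
# "#region REMOVE ... #endregion REMOVE" block, including the marker lines.
function Remove-MarkedRegion {
    param([string]$Text)
    # (?m): ^ matches at the start of each line. (?s): . also matches newlines.
    # .*? is lazy, so the match stops at the first #endregion REMOVE.
    $pattern = '(?ms)^[^\r\n]*#region REMOVE.*?#endregion REMOVE[^\r\n]*'
    return $Text -replace $pattern, ''
}

# Invented sample data, shaped like what Get-Content returns: one string per line.
$lines = @(
    '<p>keep</p>',
    '<!-- #region REMOVE -->',
    '<p>drop</p>',
    '<!-- #endregion REMOVE -->',
    '<p>also keep</p>'
)

# On an array, -replace runs per element, so no single element contains
# both markers and nothing is removed.
$perLine = $lines -replace '(?ms)^[^\r\n]*#region REMOVE.*?#endregion REMOVE[^\r\n]*', ''

# On one joined string the pattern can span lines and the region is removed.
$whole = Remove-MarkedRegion -Text ($lines -join "`n")

# The same idea applied to files: read each file as a single string.
Get-ChildItem -Path . -Recurse -Filter *.html | ForEach-Object {
    Write-Host "Checking $($_.FullName)"
    $content = [System.IO.File]::ReadAllText($_.FullName)
    $updated = Remove-MarkedRegion -Text $content
    if ($updated -cne $content) {
        # Only rewrite files that actually changed.
        [System.IO.File]::WriteAllText($_.FullName, $updated)
    }
}
```

On PowerShell 3.0 and later, `Get-Content -Raw $_.FullName` also returns the file as one string and can stand in for `ReadAllText`. Note that `WriteAllText` without an encoding argument writes UTF-8 without a byte order mark, so pass an explicit encoding if the files need something else.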