Findstr and regex: how to specify that there may be many additional characters between the lines?

I am new to regular expressions in Windows batch files. I have the following log file:

14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO

14:34:43 Operation failed! - Duration: 00:01:05
14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)

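I want to use findstr and a regular expression to check whether "Operation failed!" occurs between .ISO and "Average Read Rate". Other lines from the log may appear between these three lines. This is what I have tried: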

@echo off

findstr /R /N /C:"Operation Failed!" ImgBurn.log

pause

Edit 1: The entire log file actually looks like this:

; //****************************************\
;   ImgBurn Version 2.5.8.0 - Log
;   Mittwoch, 14 Juni 2017, 19:30:05
; \****************************************//
;
;
I 14:30:49 ImgBurn Version 2.5.8.0 started!
I 14:30:49 Microsoft Windows 8 Professional x64 Edition (6.2, Build 9200)
I 14:30:49 Total Physical Memory: 16.680.588 KiB  -  Available: 8.279.396 KiB
I 14:30:49 Initialising BS_Robots...
I 14:30:49 BS_SDK Version 2.2.0.277  Build 2013.02.22
I 14:30:49 Initialising SPTI...
I 14:30:49 Searching for Auto Loader devices...
I 14:31:07 -> Auto Loader 1 - Info: Nimbie NB21 1.13.11.26
I 14:31:07 Found 1 Auto Loader!
I 14:31:07 Searching for SCSI / ATAPI devices...
I 14:31:07 -> Drive 1 - Info: ASUS BW-16D1HT 1.01 (E:) (USB 2.0)
I 14:31:08 -> Drive 2 - Info: PLDS DVD-ROM DH-16D7S WD11 (D:) (SATA)
I 14:31:08 Found 1 DVD-ROM and 1 BD-RE XL!
I 14:33:22 Operation Started!
I 14:33:22 Source Device: [0:0:0] ASUS BW-16D1HT 1.01 (E:) (USB)
I 14:33:22 Source Media Type: CD-ROM
I 14:33:22 Source Media Supported Read Speeds: 4x; 8x; 10x; 16x; 24x; 32x; 40x; 48x
I 14:33:22 Source Media Supported Write Speeds: 16x
I 14:33:22 Source Media Sectors: 15.735
I 14:33:22 Source Media Size: 32.225.280 bytes
I 14:33:22 Source Media Volume Identifier: DGZfP BB66
I 14:33:22 Source Media File System(s): ISO9660; Joliet
I 14:33:22 Read Speed (Data/Audio): MAX / 40x
I 14:33:22 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\DGZfP BB66.iso
I 14:33:22 Destination Free Space: 338.360.074.240 Bytes (330.429.760,00 KiB) (322.685,31 MiB) (315,12 GiB)
I 14:33:22 Destination File System: NTFS
I 14:33:23 File Splitting: Auto
I 14:33:34 Read Speed - Effective: 4x
I 14:33:35 Reading Session 1 of 1... (1 Track, LBA: 0 - 15736)
I 14:33:35 Reading Track 1 of 1... (MODE1/2048, LBA: 0 - 15736)
I 14:34:43 Image MD5: 40835c9c32af1b954fbd1e8eb878ab3c
I 14:34:43 Exporting Graph Data...
I 14:34:43 Graph Data File: C:\Users\NekhayenkoO\AppData\Roaming\ImgBurn\Graph Data Files\ASUS_BW-16D1HT_1.01_MITTWOCH-14-JUNI-2017_14-33_N-A.ibg
I 14:34:43 Export Successfully Completed!
I 14:34:43 Operation Successfully Completed! - Duration: 00:01:05
I 14:34:43 Average Read Rate: 484 KiB/s (2.8x) - Maximum Read Rate: 639 KiB/s (3.7x)
I 14:35:32 Operation Started!
I 14:35:32 Source Device: [0:0:0] ASUS BW-16D1HT 1.01 (E:) (USB)
I 14:35:32 Source Media Type: CD-ROM
I 14:35:32 Source Media Supported Read Speeds: 4x; 8x; 10x; 16x; 24x; 32x; 40x; 48x
I 14:35:32 Source Media Supported Write Speeds: 48x
I 14:35:32 Source Media Sectors: 128.648
I 14:35:32 Source Media Size: 263.471.104 bytes
I 14:35:32 Source Media Volume Identifier: SME99
I 14:35:32 Source Media File System(s): ISO9660
I 14:35:32 Read Speed (Data/Audio): MAX / 40x
I 14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO
I 14:35:32 Destination Free Space: 338.327.040.000 Bytes (330.397.500,00 KiB) (322.653,81 MiB) (315,09 GiB)
I 14:35:32 Destination File System: NTFS
I 14:35:32 File Splitting: Auto
I 14:35:34 Read Speed - Effective: 48x
I 14:35:35 Reading Session 1 of 1... (1 Track, LBA: 0 - 128647)
I 14:35:35 Reading Track 1 of 1... (MODE1/2048, LBA: 0 - 128647)
I 14:37:02 Image MD5: 8bb58d62b8e4952dc2f029c8d50ac984
I 14:37:02 Exporting Graph Data...
I 14:37:02 Graph Data File: C:\Users\NekhayenkoO\AppData\Roaming\ImgBurn\Graph Data Files\ASUS_BW-16D1HT_1.01_MITTWOCH-14-JUNI-2017_14-35_N-A.ibg
I 14:37:02 Export Successfully Completed!
I 14:37:02 Operation Successfully Completed! - Duration: 00:01:12
I 14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)
I 14:37:57 Operation Started!
I 14:37:57 Source Device: [0:0:0] ASUS BW-16D1HT 1.01 (E:) (USB)
I 14:37:57 Source Media Type: DVD+R (Book Type: DVD-ROM) (Disc ID: MCC-003-00)
I 14:37:57 Source Media Supported Read Speeds: 2x; 4x; 6,3x; 8,3x; 10,3x; 12,1x
I 14:37:57 Source Media Supported Write Speeds: 4x
I 14:37:57 Source Media Sectors: 1.649.280 (Track Path: PTP)
I 14:37:57 Source Media Size: 3.377.725.440 bytes
I 14:37:57 Source Media Volume Identifier: Biomasse
I 14:37:57 Source Media Volume Set Identifier: 41628eeb        Biomasse
I 14:37:57 Source Media Application Identifier: SONIC SOLUTIONS IMAGESCRIPT
I 14:37:57 Source Media Implementation Identifier: DVD Producer 1.0
I 14:37:57 Source Media File System(s): ISO9660; UDF (1.02)
I 14:37:57 Read Speed (Data/Audio): MAX / 40x
I 14:37:57 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\Biomasse.iso
I 14:37:57 Destination Free Space: 338.066.927.616 Bytes (330.143.484,00 KiB) (322.405,75 MiB) (314,85 GiB)
I 14:37:57 Destination File System: NTFS
I 14:37:57 File Splitting: Auto
I 14:37:59 Read Speed - Effective: 5x - 12,1x
I 14:38:02 Reading Session 1 of 1... (1 Track, LBA: 0 - 1649279)
I 14:38:02 Reading Track 1 of 1... (MODE1/2048, LBA: 0 - 1649279)
I 14:43:30 Image MD5: 92955210a762c83ebd9dff7eda538dc3
I 14:43:30 Exporting Graph Data...
I 14:43:30 Graph Data File: C:\Users\NekhayenkoO\AppData\Roaming\ImgBurn\Graph Data Files\ASUS_BW-16D1HT_1.01_MITTWOCH-14-JUNI-2017_14-37_MCC-003-00.ibg
I 14:43:30 Export Successfully Completed!
I 14:43:30 Operation Failed! - Duration: 00:05:15
I 14:43:30 Average Read Rate: 10.471 KiB/s (7.7x) - Maximum Read Rate: 14.410 KiB/s (10.7x)
I 14:44:19 Operation Started!
I 14:44:19 Source Device: [0:0:0] ASUS BW-16D1HT 1.01 (E:) (USB)
I 14:44:19 Source Media Type: CD-ROM
I 14:44:19 Source Media Supported Read Speeds: 4x; 8x; 10x; 16x; 24x; 32x; 40x; 48x
I 14:44:19 Source Media Supported Write Speeds: 48x
I 14:44:19 Source Media Sectors: 110.527
I 14:44:19 Source Media Size: 226.359.296 bytes
I 14:44:19 Source Media Volume Identifier: SAMPE36
I 14:44:19 Source Media Volume Set Identifier: NOT_SET
I 14:44:19 Source Media Application Identifier: TOAST ISO 9660 BUILDER COPYRIGHT (C) 1997-2002 ROXIO, INC. - HAVE A NICE DAY
I 14:44:19 Source Media File System(s): ISO9660; Joliet
I 14:44:19 Read Speed (Data/Audio): MAX / 40x
I 14:44:19 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SAMPE36.ISO
I 14:44:19 Destination Free Space: 334.689.107.968 Bytes (326.844.832,00 KiB) (319.184,41 MiB) (311,70 GiB)
I 14:44:19 Destination File System: NTFS
I 14:44:19 File Splitting: Auto
I 14:44:21 Read Speed - Effective: 48x
I 14:44:25 Reading Session 1 of 1... (1 Track, LBA: 0 - 110526)
I 14:44:25 Reading Track 1 of 1... (MODE1/2048, LBA: 0 - 110526)
I 14:45:42 Image MD5: 960268d9b234ed2a90b8ec9800d717da
I 14:45:42 Exporting Graph Data...
I 14:45:42 Graph Data File: C:\Users\NekhayenkoO\AppData\Roaming\ImgBurn\Graph Data Files\ASUS_BW-16D1HT_1.01_MITTWOCH-14-JUNI-2017_14-44_N-A.ibg
I 14:45:42 Export Successfully Completed!
I 14:45:42 Operation Successfully Completed! - Duration: 00:01:05
I 14:45:42 Average Read Rate: 3.400 KiB/s (19.7x) - Maximum Read Rate: 4.576 KiB/s (26.6x)

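FIND and FINDSTR are written to search each line of a file for a string and to output the entire line containing a found string. Matching across multiple lines is not something they are designed for, and FINDSTR's regular expression support is very limited. Run find /? and findstr /? in a command prompt window to get help on these two commands.

However, the log file can be processed with a FOR loop that uses string substitution and string comparison.

Example lines of ImgBurn.log: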

14:25:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME98.ISO
14:27:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)

14:33:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO
14:34:43 Operation failed! - Duration: 00:01:05
14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)

Desired output for this log file:

Success on: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME98.ISO
Error on:   C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO

Batch code that produces this output when processing the lines of ImgBurn.log:

@echo off
if not exist "ImgBurn.log" goto :EOF

setlocal EnableExtensions EnableDelayedExpansion
set "Searching=0"
set "IsoFileName="

for /F "usebackq delims=" %%L in ("ImgBurn.log") do (
    set "Line=%%L"
    if !Searching! == 0 (
        if /I "!Line:~-4!" == ".iso" (
            set "Searching=1"
            for /F "tokens=3*" %%I in ("!Line!") do set "IsoFileName=%%J"
        )
    ) else (
        if not "!Line:Average Read Rate=!" == "!Line!" (
            echo Success on: !IsoFileName!
            set "Searching=0"
        ) else if not "!Line:Operation failed=!" == "!Line!" (
            echo Error on:   !IsoFileName!
            set "IsoFileName="
            set "Searching=0"
        )
    )
)
endlocal

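The FOR command reads the specified log file line by line and processes all lines except empty lines and lines starting with ;, which is the default for the eol (end of line) option.

Each line is assigned in full to the loop variable L because the option delims= is used. It turns off the default splitting of the line into tokens, which would use space and horizontal tab as delimiters.

The line is then assigned to the environment variable Line.

Note: On the command line set "Line=%%L", a single exclamation mark in a line read from the file is removed by the Windows command interpreter because delayed expansion is enabled. If a line contains two exclamation marks, the string between them is interpreted as an environment variable name and is replaced by the value of that variable, or by nothing if no environment variable with that name (matched case-insensitively) exists. None of this matters as long as the ISO file name and its path do not contain one or more exclamation marks.

Next, an IF condition determines whether the search for Average Read Rate or Operation failed is currently active, or whether a search for an ISO file name should be performed because Searching has the value 0.

While searching for an ISO file name, a line is taken to contain one when its last 4 characters are equal to .iso, compared case-insensitively.

If the current line satisfies this condition, Searching is switched to 1 and another FOR loop extracts the ISO file name with its full path from the current line.

This inner FOR loop processes the string value of the environment variable Line rather than a file, because the usebackq option is not used as it is in the outer FOR loop. Using the default delimiters space and tab, the line string is split into 4 tokens: the time string, Destination, File: and the rest of the line. Because of the option tokens=3*, the third token File: is assigned to loop variable I and the rest of the line is assigned to J, the next loop variable according to the ASCII table.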
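The effect of tokens=3* can be illustrated outside of batch. The following is a minimal Python sketch, not part of the batch solution; the function name rest_after_file_token and its leading_fields parameter are made up for this illustration. It shows how everything after the third whitespace-delimited field stays together, including the spaces inside the path.

```python
def rest_after_file_token(line, leading_fields=2):
    """Return the remainder of a 'Destination File:' line after 'File:'.

    leading_fields is the number of whitespace-delimited fields in front
    of 'File:' (2 for 'HH:MM:SS Destination', 3 with a leading 'I').
    """
    # Limiting the number of splits keeps the rest of the line in one
    # piece, much like the '*' in tokens=3* collects the remainder.
    parts = line.split(None, leading_fields + 1)
    return parts[-1]


line = r"14:33:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO"
print(rest_after_file_token(line))
```

With one more field in front of the time stamp, leading_fields=3 plays the same role as switching to tokens=4*.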

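Once an ISO file name has been read from a line, the search changes.

Now the batch code replaces, case-insensitively, all occurrences of Average Read Rate in the string value of the environment variable Line with an empty string and compares the result with the unmodified line string. If the current line contains Average Read Rate, the two strings are not equal. This means the operation on the image was successful, which is output accordingly, and Searching is switched back to 0.

If the current line does not contain Average Read Rate, the code checks whether it contains Operation failed, again case-insensitively. If it does, the line after string substitution is not equal to the unmodified line, so the operation was not successful, which is output accordingly. Searching is then switched back to 0 so that the next ISO file name at the end of a line in the log file is searched for.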
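The two-state logic described above can also be written down compactly in another language. This is only a Python sketch of the same idea, not a replacement for the batch file; the function name classify_operations and the rule that the path is everything after "File:" are assumptions made for the illustration.

```python
def classify_operations(lines):
    """Return (status, iso_path) pairs, mirroring the Searching flag logic."""
    results = []
    iso_path = None  # None corresponds to Searching=0 in the batch file
    for raw in lines:
        line = raw.rstrip()
        # FOR /F skips empty lines and, by default, lines starting with ';'
        if not line or line.startswith(";"):
            continue
        lowered = line.lower()
        if iso_path is None:
            if lowered.endswith(".iso"):
                # Assumption: the path is everything after "File:"
                iso_path = line.split("File:", 1)[-1].strip()
        elif "average read rate" in lowered:
            results.append(("Success", iso_path))
            iso_path = None
        elif "operation failed" in lowered:
            results.append(("Error", iso_path))
            iso_path = None
    return results


SAMPLE = [
    r"14:25:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME98.ISO",
    "14:27:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)",
    "",
    r"14:33:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO",
    "14:34:43 Operation failed! - Duration: 00:01:05",
    "14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)",
]

for status, path in classify_operations(SAMPLE):
    print(f"{status} on: {path}")
```

After an error has been reported, the following Average Read Rate line is ignored because the code is back in the state that looks for the next line ending in .iso.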

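For the log line format posted with Edit 1, the option string tokens=3* of the inner FOR loop must be changed to tokens=4*, because compared to the originally posted log lines each line starts with an additional string delimited by a space or tab.

Output for the log file content of Edit 1 with this small modification: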

Success on: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\DGZfP BB66.iso
Success on: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO
Error on:   C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\Biomasse.iso
Success on: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SAMPE36.ISO

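To understand the commands used and how they work, open a command prompt window, execute the following commands there, and carefully read all help pages displayed for each command.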

  • echo /?
  • endlocal /?
  • for /?
  • goto /?
  • if /?
  • set /?
  • setlocal /?

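Batch files are known for their poor regular expression support.

If you only want to search for the string without its surroundings (no regular expression), it is simple: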

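@echo off
rem /C: treats the whole phrase as one literal string; /I ignores case
rem because the full log writes "Operation Failed!" with a capital F.
findstr /I /M /C:"Operation failed!" ImgBurn.log > nul
IF %errorlevel% EQU 0 (
    ECHO FOUND!
) ELSE (
    ECHO NOT FOUND!
)
PAUSE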

If you insist on regular expressions, I would turn to PowerShell launched from a .bat file:

The batch file looks like this:

@ECHO OFF
powershell.exe -noninteractive -NoProfile -ExecutionPolicy Bypass -Command "& {.\find_string_in_file.ps1};"
PAUSE

The PowerShell file that runs the regular expression (find_string_in_file.ps1):

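$file = Get-Content C:\prg\PowerShell\_Snippets\Inputfile\ImgBurn.log
# The lookaheads only test the text directly at their position, so this
# pattern effectively matches the literal text "Operation failed!".
$search_for = [regex]"(?!\.ISO)Operation failed!(?!Average Read Rate)"
$found = $search_for.Match($file)

# Success is $false when nothing matched, so no empty capture is indexed
If ($found.Success) {
    Write-Host "FOUND!"
} Else {
    Write-Host "NOT FOUND!"
}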

Edit: If you want to get multiple results, you can do it as follows (after $found = $search_for.Match($file)):

# create the list that collects the matches
$resultslist = New-Object System.Collections.ArrayList
while ($found.Success) {
    $resultslist.Add($found.Value) | Out-Null
    $found = $found.NextMatch()
}

Then you print the output with foreach.

Edit 2: Based on your comment, I have prepared a sample ImgBurn.log file:

14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME98.ISO

14:34:43 Operation failed! - Duration: 00:01:05
14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)

14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO

14:34:43 Operation failed! - Duration: 00:01:05
14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)

14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME100.ISO

14:34:43 Operation Successful - Duration: 00:01:05
14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)

14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME101.ISO

14:34:43 Operation failed! - Duration: 00:01:05
14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)

14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME102.ISO

14:34:43 Operation Successful - Duration: 00:01:05
14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)

I promised a complete solution (the batch file stays the same). Here it is:

$file = Get-Content C:\prg\PowerShell\_Snippets\Inputfile\file.log
# Matching regex
$search_for = [regex]"(?!\.ISO)Operation failed!(?!Average Read Rate)"
$found = $search_for.Match($file)

# initialize ArrayList
[System.Collections.ArrayList]$list = @()
while ($found.Success) {
    # Out-Null suppresses the output of Add(), which otherwise prints the index of the added element (e.g. 0, 1, 2, etc.)
    $list.Add($found.value) | out-null
    $found = $found.NextMatch()
}

# reading the ArrayList
foreach ($found_value in $list) {
    Write-Host "Found: $found_value"
}

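However, this solution does not seem all that useful to me.

Getting the failed file names with a regular expression

You are probably looking for the files that failed. I have prepared a modification that shows you the failed files: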

# Do not forget the -Raw switch. Without it the pattern will not match correctly.
# The -Raw switch makes PowerShell read the text file as a single string,
# not as an array of strings split at the line ends.
$file = Get-Content C:\prg\PowerShell\_Snippets\Inputfile\file.log -Raw

# Matching regex
$search_for = [regex]"\w+\.ISO(?=(\r|\n)*.*Operation failed)"
$found = $search_for.Match($file)

# initialize ArrayList
[System.Collections.ArrayList]$list = @()
while ($found.Success) {
    # Out-Null suppresses the output of Add(), which otherwise prints the index of the added element (e.g. 0, 1, 2, etc.)
    $list.Add($found.value) | out-null
    $found = $found.NextMatch()
}

# reading the ArrayList
foreach ($found_value in $list) {
    Write-Host "Operation failed at file: $found_value"
}
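To see why reading the file as one string matters for the lookahead, here is a small Python sketch of the same idea. It is an illustration under assumptions rather than a port of the script: the names FAILED_ISO, failed_images and failed_images_per_line are made up here, the group is written as non-capturing so that findall() returns the whole match, and Python's re module is not guaranteed to behave identically to the .NET regex engine in every detail.

```python
import re

# Same pattern shape as in the PowerShell script, with a non-capturing group
FAILED_ISO = re.compile(r"\w+\.ISO(?=(?:\r|\n)*.*Operation failed)")

LOG_TEXT = "\n".join([
    r"14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO",
    "",
    "14:34:43 Operation failed! - Duration: 00:01:05",
    "14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)",
    "",
    r"14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME100.ISO",
    "",
    "14:34:43 Operation Successful - Duration: 00:01:05",
    "14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)",
])


def failed_images(text):
    """Search the whole text at once, so the lookahead can cross line ends."""
    return FAILED_ISO.findall(text)


def failed_images_per_line(text):
    """Search line by line; the lookahead never sees the following line."""
    return [m for line in text.splitlines() for m in FAILED_ISO.findall(line)]


print(failed_images(LOG_TEXT))           # ['SME99.ISO']
print(failed_images_per_line(LOG_TEXT))  # []
```

The per-line variant finds nothing because the failure message is never on the same line as the file name, which is the situation the -Raw switch avoids.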

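The batch file below does exactly what you asked for, that is, "use findstr and a regular expression to check whether "Operation failed" occurs between the word .ISO and "Average Read Rate"" in the log file, including additional empty lines between these three lines.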

@echo off
setlocal EnableDelayedExpansion

rem Define LF variable containing a LineFeed
set LF=^
%Empty line 1/2, don't remove%
%Empty line 2/2, don't remove%

rem Define CR variable containing a CarriageReturn
for /F %%a in ('copy /Z "%~F0" nul') do set "CR=%%a"

findstr /R /C:"\.ISO[!CR!!LF!]*.*Operation failed.*[!CR!!LF!]*.*Average Read Rate" test.txt > NUL
if errorlevel 1 (echo Not found) else echo STRING FOUND

Tested on Windows 8.1.
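The shape of that pattern can be explored with a short Python sketch. This is only an approximation: Python's re engine is not findstr, so it shows what the pattern expresses rather than how findstr behaves, and the second log excerpt is a hypothetical mix of the posted lines. The names PATTERN and failed_between are invented for the illustration.

```python
import re

# Pattern shape from the batch file, with CR/LF written as \r and \n
PATTERN = re.compile(r"\.ISO[\r\n]*.*Operation failed.*[\r\n]*.*Average Read Rate")


def failed_between(text):
    """True if the three markers appear on consecutive non-empty lines."""
    return PATTERN.search(text) is not None


# Only empty lines between the three lines of interest
blank_lines_only = "\n".join([
    r"14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO",
    "",
    "14:34:43 Operation failed! - Duration: 00:01:05",
    "14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)",
])

# Hypothetical excerpt with another log line in between
other_line_between = "\n".join([
    r"14:35:32 Destination File: C:\Users\NekhayenkoO\Desktop\LOG Dateien CD Imaging\SME99.ISO",
    "14:35:32 Destination File System: NTFS",
    "14:34:43 Operation failed! - Duration: 00:01:05",
    "14:37:02 Average Read Rate: 3.573 KiB/s (20.7x) - Maximum Read Rate: 4.812 KiB/s (27.9x)",
])

print(failed_between(blank_lines_only))
print(failed_between(other_line_between))
```

In this sketch the pattern tolerates empty lines between the three lines but not other log lines, so for a log like the one in Edit 1 a line-by-line approach may be the safer choice.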