Batch file to extract text from file

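I have a log file that I am trying to extract specific lines from. When I crop the file to a few lines above and below the block, I am able to get it. However, the full file contains multiple instances of the strings I search for, which prevents my code from working on the full file.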

Here is some of the code I have tried...

 for /f "tokens=1* delims=[]" %%a in ('find /n "    <Line Text="***********TEST1  TEST  TEST************" />" ^< TEST.LOG') do (set H=%%a
 )

 for /f "tokens=1* delims=[]" %%a in ('find /n "</Report>" ^< TEST.LOG') do (
 set T=%%a
 )

 for /f "tokens=1* delims=[]" %%a in ('find /n /v "" ^< TEST.LOG') do (
 if %%a GEQ !H! if %%a LEQ !T! echo.%%b
 )>> newfile.txt

I expect to get the following:

 <Line Text="***********TEST1  TEST  TEST************" />
 ~ALL LINES IN BETWEEN~
 </Report>

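The Windows command processor, designed for executing commands and executables and not for processing text files, is definitely the worst choice for filtering TEST.LOG. For the reasons, read my linked answer in full; the batch file code described there in detail is used as the template for the batch file code below: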

@echo off
setlocal EnableExtensions DisableDelayedExpansion
if not exist "Test.log" goto EndBatch
set "OutputLines="

(for /F delims^=^ eol^= %%I in ('%SystemRoot%\System32\findstr.exe /N "^" "Test.log"') do (
    set "Line=%%I"
    setlocal EnableDelayedExpansion
    if defined OutputLines (
        echo(!Line:*:=!
        if not "!Line:</Report>=!" == "!Line!" (
            endlocal & set "OutputLines="
        ) else endlocal
    ) else if not "!Line:<Line Text=!" == "!Line!" (
        echo(!Line:*:=!
        endlocal & set "OutputLines=1"
    ) else endlocal
))>"newfile.txt"

if exist "newfile.txt" for %%I in ("newfile.txt") do if %%~zI == 0 del "newfile.txt"

:EndBatch
endlocal

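This batch file writes all lines of Test.log from a line containing the string <Line Text (matched case-insensitively) up to a line containing the string </Report> (also matched case-insensitively), or up to the end of the file, into the file newfile.txt.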

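Note: The search string between !Line: and = cannot contain an equal sign, because the Windows command processor interprets the equal sign as the separator between the search string (here </Report> and <Line Text) and the replace string (here an empty string both times). An asterisk * at the beginning of the search string is interpreted by the Windows command processor as an instruction to replace everything from the beginning of the line up to the first occurrence of the found string, and not as a character to search for during the string substitution. But this does not matter for this use case.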
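The two substitution forms described in the note can be tried in isolation. The following is only a minimal sketch: the value assigned to Line is a hypothetical sample in the form produced by findstr /N (line number, colon, line text) and is not taken from the real log file.

```bat
@echo off
setlocal EnableExtensions EnableDelayedExpansion

rem Hypothetical sample line as output by findstr /N: line number, colon, text.
set "Line=12:  </REPORT>"

rem *: as search string removes everything up to and including the first colon.
echo(!Line:*:=!

rem The search is case-insensitive, so </Report> also matches </REPORT>.
rem If removing the search string changes the value, the line contains it.
if not "!Line:</Report>=!" == "!Line!" echo End marker found

endlocal
```

The first command prints the line text without the line number prefix, and the condition is true because the substituted value differs from the original one.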

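If the two lines marking the beginning and the end of the block to extract are fixed and do not contain any variable part, the two string comparisons can be done without string substitution, which makes it possible to compare strings containing an equal sign.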

@echo off
setlocal EnableExtensions DisableDelayedExpansion
if not exist "Test.log" goto EndBatch

set "BlockBegin= <Line Text="***********TEST1  TEST  TEST************" />"
set "BlockEnd= </Report>"
set "OutputLines="

(for /F delims^=^ eol^= %%I in ('%SystemRoot%\System32\findstr.exe /N "^" "Test.log"') do (
    set "Line=%%I"
    setlocal EnableDelayedExpansion
    if defined OutputLines (
        echo(!Line:*:=!
        if "!Line:*:=!" == "!BlockEnd!" (
            endlocal & set "OutputLines="
        ) else endlocal
    ) else if "!Line:*:=!" == "!BlockBegin!" (
        echo(!Line:*:=!
        endlocal & set "OutputLines=1"
    ) else endlocal
))>"newfile.txt"

if exist "newfile.txt" for %%I in ("newfile.txt") do if %%~zI == 0 del "newfile.txt"

:EndBatch
endlocal

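This variant compares each line case-sensitively with the strings assigned to the environment variables BlockBegin and BlockEnd to determine at which line to start and at which line to stop outputting lines.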
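The difference between a case-sensitive and a case-insensitive comparison can be checked with a short test. This is a sketch only: the variable Text and its value are hypothetical, and the use of if /I is an optional variation that is not part of the batch file above.

```bat
@echo off
setlocal EnableExtensions EnableDelayedExpansion

set "BlockEnd=</Report>"
rem Hypothetical line text that differs from BlockEnd only in case.
set "Text=</REPORT>"

rem The string comparison of IF is case-sensitive by default.
if "!Text!" == "!BlockEnd!" (echo Exact match) else echo No exact match

rem With option /I the comparison ignores case.
if /I "!Text!" == "!BlockEnd!" (echo Match ignoring case) else echo No match

endlocal
```

Whether /I is appropriate depends on how strictly the block markers in the log file have to be matched.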

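To understand the commands used and how they work, open a command prompt window, execute the following commands there, and read carefully all help pages displayed for each command.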

  • del /?
  • echo /?
  • endlocal /?
  • findstr /?
  • for /?
  • goto /?
  • if /?
  • set /?
  • setlocal /?

See also:

  • Single line with multiple commands using Windows batch file
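  • Details about the commands SETLOCAL and ENDLOCAL, which are needed inside the loop and are responsible for the very inefficient processing of the lines because of the large amount of memory copying done additionally in the background.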

You can try this code:

@echo off
Title Extract Data between two tags
Set "InputFile=InputFile.txt"
Set From_Start="<Line"
Set To_End="</Report>"
Set "OutputFile=OutputFile.txt"
Call :ExtractData %InputFile% %From_Start% %To_End%
Call :ExtractData %InputFile% %From_Start% %To_End%>%OutputFile%
If Exist %OutputFile% Start "" %OutputFile%
Exit
::'*************************************************************
:ExtractData <InputFile> <From_Start> <To_End>
(
echo Set fso = CreateObject^("Scripting.FileSystemObject"^)
echo Set f=fso.opentextfile^("%~1",1^)
echo Data = f.ReadAll
echo Data = Extract(Data,"(%~2.*\r\n)([\w\W]*)(\r\n)(%~3)"^)
echo WScript.StdOut.WriteLine Data
echo '************************************************
echo Function Extract(Data,Pattern^)
echo    Dim oRE,oMatches,Match,Line
echo    set oRE = New RegExp
echo    oRE.IgnoreCase = True
echo    oRE.Global = True
echo    oRE.Pattern = Pattern
echo    set Matches = oRE.Execute(Data^)
echo    If Matches.Count ^> 0 Then Data = Matches^(0^).SubMatches^(1^)
echo    Extract = Data
echo End Function
echo '************************************************
)>"%tmp%\%~n0.vbs"
cscript //nologo "%tmp%\%~n0.vbs"
If Exist "%tmp%\%~n0.vbs" Del "%tmp%\%~n0.vbs"
exit /b
::****************************************************

Updated:

You want to find <Line Text="***********TEST1 TEST TEST************" /> then print it and any line until the First </Report> is encountered, then look for the next <Line Text="***********TEST1 TEST TEST************" /> and print it and every following line until the next </Report> for every time it occurs throughout?

– Ben Personick 1 hour ago

OR do you just want to take from the first <Line Text="***********TEST1 TEST TEST************" /> to the first </Report>?

– Ben Personick 1 hour ago

Find <Line Text="***********TEST1 TEST TEST************" /> then print it and any line until the first </Report> is encountered, then look for the next <Line Text="***********TEST1 TEST TEST************" /> and print it and every following line until the next </Report>, for every time it occurs. I feel there should ONLY be 1 sequence, however, at times this situation could very well be possible. Thanks for asking, very solid question!

– T-Diddy 1 hour ago

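OK, this should work the way you expect. However, if the lines contain a lot of unexpected characters, it may make more sense to change how the lines are output and use SET instead of echo.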

@(setlocal
  ECHO OFF
  SET "_LogFile=C:\Admin\TestLog.log"
  SET "_ResultFile=C:\Admin\TestLog.txt"
  SET "_MatchString_Begin=<Line Text="***********AAAAA BBBB CCCC************" />"
  SET "_MatchString_End=</Report>"
  SET "_Line#_Begin="
)

CALL :Main

( ENDLOCAL
  EXIT/B
)
:Main
  IF EXIST "%_ResultFile%" (
    DEL /F /Q "%_ResultFile%"
  )
  ECHO.&ECHO.== Processing ==&ECHO.
  FOR /F "Delims=[]" %%# IN ('
    Find /N "%_MatchString_Begin:"=""%" "%_LogFile%" ^| FIND "["
  ') DO (
    ECHO. Found Match On Line %%#
    SET /A "_Line#_Begin=%%#-1"
    CALL :Output
  )
  ECHO.&ECHO.== Completed ==&ECHO.&ECHO.Results to Screen will Start in 5 Seconds:
  timeout 5
  Type "%_ResultFile%"
GOTO :EOF

:Output
  FOR /F "SKIP=%_Line#_Begin% Tokens=* usebackq" %%_ IN (
    "%_LogFile%"
  ) DO (
    ECHO(%%_
    ECHO("%%_" | FIND /I "%_MatchString_End%" >NUL&&(
      GOTO :EOF
    )
  )>>"%_ResultFile%"
GOTO :EOF

The original response, which shows only the content of the first match, was based on this comment:

This works great with my "cropped" file. However, in the ORIGINAL, ONLY unique line I have is <Line Text="***********AAAAA BBBB CCCC************" />. I can't seem to be able to use the full line as my batch just exits out, but am able to input "***********AAAAA BBBB CCCC************" and does not kick my batch out, however, exists elsewhere. Thus, requiring the other parameters as it is unique within the file. and I want the next following: in sequence. Otherwise this "</Report>" exists above in another section I don't want and believe is causing issue. – T-Diddy 3 mins ago

OK, that is what I thought.

Try this:

@(setlocal
  ECHO OFF
  SET "_LogFile=C:\Admin\TestLog.log"
  SET "_MatchString_Begin=<Line Text="***********AAAAA BBBB CCCC************" />"
  SET "_MatchString_End=</Report>"
  SET "_Line#_Begin="
  SET "_Line#_End="
)
REM SET
FOR /F "Delims=[]" %%# IN ('
  Find /N "%_MatchString_Begin:"=""%" "%_LogFile%" ^| FIND "["
') DO (
  IF NOT DEFINED _Line#_Begin (
    SET /A "_Line#_Begin=%%#-1"
    ECHO.SET /A "_Line#_Begin=%%#-1"
  )
)
FOR /F "SKIP=%_Line#_Begin% Tokens=* usebackq" %%_ IN (
  "%_LogFile%"
) DO (
  IF NOT DEFINED _Line#_End (
    ECHO(%%_
    ECHO("%%_" | FIND /I "%_MatchString_End%" &&(
      SET "_Line#_End=1"
    )
  )
)
PAUSE