Batch file extracting certain tokens from unknown number of delimiters in variable

Third, and hopefully final, revision of the question...

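The batch file uses a for loop to read a text file line by line into a variable. Each line of the text file may be formatted completely differently from the next. The only common delimiter is a four-digit number (a year) somewhere in each line. The goal is to return, via echo, whatever text follows that four-digit number on each line.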

Example text file:

Monday, January 1, 1900 there was an event-e6718
On this day in 1904 nothing occurred
Wednesday, March 3, 1908 an error occurred when attempting to access the log
Thursday, , 1911 - access denied
Friday, in whatever month, on whatever day, in 1938, nothing happened

So, based on the example text file above, the return would look like...

there was an event-e6718
nothing occurred
an error occurred when attempting to access the log
- access denied
nothing happened

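As of 1318 PST, I have tried every code snippet in the comments below, but none of them return the data I need.

However, those comments relate to my original question, which has since been significantly revised.

I even tried the regular expression "^[1-9][0-9][0-9][0-9]$", but I am new to regular expressions, so I am sure I got it wrong.

Is this possible?

Thanks in advance.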

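If the strings really have nothing in common, or the token count is inconsistent, adjust the number of iterations in the for loop below to match the maximum possible number of tokens.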

@Echo off & Setlocal EnableDelayedexpansion
  Set "event=Monday, January 1, 1900 there was an event-e6718"
  For /L %%i in (1 1 10) Do (
   Set "event=!event:*, =!"
  )& rem // arbitrary number of iterations that should be adjusted to match the maximum expected tokens
  Set "event=%event:~5,100%"& rem // remove the year[space] from the string - final string maximum length is also arbitrary and may need adjusting.
  Echo/%event%
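
The loop above relies on the `*` form of variable substitution, which removes everything up to and including the first occurrence of the search string. As a minimal sketch of that one step in isolation (the variable names `line` and `rest` and the hard-coded year are illustrative assumptions, not part of the original answer):

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical sample line taken from the example text file
set "line=On this day in 1904 nothing occurred"
rem !var:*text=! deletes everything up to and including the first "1904 "
set "rest=!line:*1904 =!"
echo(!rest!
```

Because the year has to be known before it can be used as the search string, this only works once the year has been found by some other means.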

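** Update ** A macro version of the above method; an example macro that gets the last token from within a for loop:

Note: you will need to adjust the file path of the input file.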

@Echo off

(Set \n=^^^
%=Newline Var=%
)

  Set Gettoken=For %%n in (1 2) Do if %%n==2 (%\n%
    For /F "Delims=" %%G in ("!string!") Do (%\n%
    Set "output=%%G"%\n%
    For %%D in ("!Delim!") Do For /L %%i in (1 1 10) Do Set "output=!output:*%%~D=!"%\n%
    Set "output=!output:~5,100!"%\n%
  )%\n%
  Set output%\n%
) Else Set string=
Setlocal EnableDelayedExpansion
 Set "Delim=, "&& For /F "Delims=" %%I in (inputfile.txt) Do %GetToken%%%I

Edit:

A solution for what the revised question actually asks.

@Echo off & CD "%~dp0"
Setlocal Enabledelayedexpansion
rem // replace inputfile.txt with your filepath
For /F "Delims=" %%L in (inputfile.txt) Do (
Call :sub "%%L"
rem // the for loop below removes everything up to and including the first year from the string
rem // as well as a trailing comma[space] / [space]
For %%E in (!Errorlevel!) Do (
 If Not "%%E"=="0" (
  Set "String=!String:*%%E=####!"
  Set "String=!String:####, =!"
  Set "String=!String:#### =!"
  Set "String=!String:####=!"
 )
)
rem // output only if a year "delimiter" was encountered
If not "%%~L"=="!String!" Echo/!String!
)

Exit /b
:sub
Set "String=%~1"
rem // adjust for loop %%I for valid year range and %%# for maximum expected string length
For /L %%I in (1899 1 2050) Do (For /L %%# in (0 1 100) Do (If "!String:~%%#,4!"=="%%I" (Exit /B %%I)))
Exit /B 0

Try this:

@echo off & setlocal enabledelayedexpansion
for /f "delims=" %%i in ('type "C:\textfile.txt" ^| findstr /IRC:"there was an event"') do (
   set "event=%%i"
   echo "!event:*there was an event=there was an event!"
)

textfile.txt

Monday, January 1, 1900 there was an event-e6718
On this day in 1904 nothing occurred
Wednesday, March 3, 1908 an error occurred when attempting to access the log
Thursday, , 1911 - access denied
Monday, January 1, 1910 there was an event-dsfd318
Friday, in whatever month, on whatever day, in 1938, nothing happened

Result:

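Doing this in pure batch is a horrible task. REGEX is a great tool, but cmd does not support it (apart from a very crippled subset in findstr). If you are willing to use an external tool, it is easy: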

<old.txt call jrepl ".*(\d{4})\D\ *(.*$)" "$2" >new.txt

This searches for four digits \d{4}, followed by a non-digit \D and zero or more spaces, then everything up to the end of the line .*$. The (parentheses) mark capture groups, which are referenced by $x. The string you want is in $2.

Output with your example file:

there was an event-e6718
nothing occurred
an error occurred when attempting to access the log
- access denied
there was an event-dsfd318
nothing happened

If you decide to include the year, you can find it in $1:

<old.txt call jrepl ".*(\d{4})\D\ *(.*$)" "$1: $2" >new.txt

Which gives:

1900: there was an event-e6718
1904: nothing occurred
1908: an error occurred when attempting to access the log
1911: - access denied
1910: there was an event-dsfd318
1938: nothing happened

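call is required inside a batch file because jrepl is itself a batch file, so without call control would never return to your script.
(The REGEX pattern may need improvement; I don't have much experience with it yet.)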
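
For comparison, here is a rough sketch of what findstr's limited regex subset can do on its own. It can select the lines that contain four consecutive digits, but it always prints whole lines and cannot extract the part after the year. The file name `inputfile.txt` is an assumption.

```batch
@echo off
rem Print only those lines of the (assumed) input file that contain
rem four consecutive digits somewhere; findstr has no {4} quantifier,
rem so the character class is repeated four times.
findstr /R "[0-9][0-9][0-9][0-9]" "inputfile.txt"
```

No ^ or $ anchors are used here, since anchoring both ends would only match lines that consist of nothing but the four digits.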

jrepl.bat was written by dbenham.