Regexp - Get the first and last part

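Is there a way to get the first and last part of the lines below? I guess a regular expression is the way to go, preferably in Notepad++.

This doesn't need to be super optimized or anything. It will be run manually once in a while, and if it runs for a few minutes, that's fine.

If handling both the 'TimingLog' and 'HandleArticleWarningOnOrder' lines in the same regex is a big problem, I can run two different regexes and merge the results.

I first use this regex to find the lines, which come from a larger list containing many lines I'm not interested in: ^.{26}(HandleArticleWarningOnOrder -> -1.*|Timinglog.*)

Note that the lines can be longer or shorter than the examples below.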

Input

2022-01-11 09:52:35.65 -> TimingLog -> 1: '69' -2: '434' -3: '434' -4: '434' -5: '509' -6: '509' -6.1: '509' -7: '588' -19: '588' -20: '588' -21: '5145' -22: '5202' -23: '5224' -24: '5233' -25: '5233'
2022-01-11 09:52:48.82 -> TimingLog -> 1: '47' -2: '213' -3: '213' -4: '213' -5: '269' -6: '269' -6.1: '269' -7: '298' -8: '298' -12: '380' -13: '380' -14: '6270' -15: '6328' -16: '6347' -17: '6356' -18: '6356'
2022-01-11 09:53:02.68 -> TimingLog -> 1: '23' -2: '54' -3: '54' -4: '54' -5: '65' -6: '65' -6.1: '65' -7: '76' -19: '76' -20: '76' -21: '4916' -22: '4982' -23: '5010' -24: '5015' -25: '5015'
2022-01-11 09:53:06.57 -> HandleArticleWarningOnOrder ->  -1: '160' -2: '223' -1: '223' -2: '285' -1: '285' -2: '671' -1: '671' -2: '816' -1: '816' -2: '970' -1: '970' -2: '1122' -3: '1122' -4: '1312' -5: '17766' -6: '17766'
2022-01-11 09:53:17.01 -> TimingLog -> 1: '140' -2: '527' -3: '527' -4: '527' -5: '671' -6: '671' -6.1: '671' -7: '737' -19: '737' -20: '737' -21: '5984' -22: '6163' -23: '6307' -24: '6339' -25: '6339'
2022-01-11 09:53:25.12 -> TimingLog -> 1: '25' -2: '85' -3: '85' -4: '85' -5: '108' -6: '108' -6.1: '108' -7: '117' -19: '117' -20: '117' -21: '7706' -22: '7880' -23: '8018' -24: '8110' -25: '8110'
2022-01-11 09:53:31.90 -> TimingLog -> 1: '51' -2: '210' -3: '210' -4: '210' -5: '269' -6: '269' -6.1: '269' -7: '324' -19: '324' -20: '324' -21: '6641' -22: '6675' -23: '6704' -24: '6711' -25: '6711'
2022-01-11 09:53:44.04 -> TimingLog -> 1: '27' -2: '121' -3: '121' -4: '121' -5: '202' -6: '202' -6.1: '202' -7: '215' -19: '215' -20: '215' -21: '6520' -22: '6566' -23: '6594' -24: '6604' -25: '6604'
2022-01-11 09:53:53.51 -> TimingLog -> 1: '72' -2: '275' -3: '275' -4: '275' -5: '302' -6: '302' -6.1: '302' -7: '327' -8: '327' -12: '413' -13: '413' -14: '7408' -15: '7571' -16: '7725' -17: '7731' -18: '7731'
2022-01-11 09:54:04.27 -> TimingLog -> 1: '22' -2: '72' -3: '72' -4: '72' -5: '86' -6: '86' -6.1: '86' -7: '105' -8: '105' -12: '147' -13: '147' -14: '5192' -15: '5223' -16: '5251' -17: '5269' -18: '5269'
2022-01-11 09:54:09.16 -> HandleArticleWarningOnOrder ->  -1: '91' -2: '188' -2.1: '188' -3: '188' -4: '351' -5: '18276' -6: '18276'
2022-01-11 09:54:12.80 -> TimingLog -> 1: '13' -2: '43' -3: '43' -4: '43' -5: '51' -6: '51' -6.1: '51' -7: '57' -8: '57' -12: '86' -13: '86' -14: '8024' -15: '8263' -16: '8430' -17: '8524' -18: '8524'
2022-01-11 09:54:21.30 -> TimingLog -> 1: '105' -2: '353' -3: '353' -4: '353' -5: '414' -6: '414' -6.1: '414' -7: '470' -8: '470' -12: '814' -13: '814' -14: '8172' -15: '8336' -16: '8449' -17: '8480' -18: '8480'
2022-01-11 09:54:34.02 -> HandleArticleWarningOnOrder ->  -1: '102' -2: '154' -2.1: '154' -3: '154' -4: '202' -5: '20106' -6: '20106'
...

Preferred output

2022-01-11 09:52:35.65 -> TimingLog -> '5233'
2022-01-11 09:52:48.82 -> TimingLog -> '6356'
2022-01-11 09:53:02.68 -> TimingLog -> '5015'
2022-01-11 09:53:06.57 -> HandleArticleWarningOnOrder -> '17766'
2022-01-11 09:53:17.01 -> TimingLog -> '6339'
...

For your specific matches, you can use an alternation inside a branch reset group:

^(.{26})(?|(HandleArticleWarningOnOrder ->)\h{2,}-1\b|(TimingLog ->)).*('\d+')

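The pattern matches:

  • ^ Start of the string
  • (.{26}) Capture group 1, match 26 characters (you could consider making this part more specific)
  • (?| Branch reset group
    • (HandleArticleWarningOnOrder ->)\h{2,}-1\b Capture the text in group 2, then match 2 or more horizontal whitespace characters and -1, followed by a word boundary to prevent a partial match
    • | Or
    • (TimingLog ->) Capture group 2, match literally
  • ) Close the branch reset group
  • .* Match the rest of the line
  • ('\d+') Capture group 3, match the last occurrence of 1+ digits between single quotes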

In the replacement, use capture groups 1, 2 and 3.
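As a rough illustration outside Notepad++, the same idea can be sketched in Python. Python's built-in `re` module supports neither the branch reset group `(?|...)` nor `\h`, so this sketch swaps in one ordinary capture group with a lookahead and uses `[ \t]` for horizontal whitespace. The replacement `\1\2 \3` and the helper name `shorten` are assumptions made for illustration.

```python
import re

# Port of the pattern to Python's re module (an adaptation, not the original):
#   group 1: the 26-character prefix (timestamp plus " -> ")
#   group 2: the log name, where HandleArticleWarningOnOrder must be followed
#            by 2+ spaces/tabs and "-1" (checked with a lookahead)
#   group 3: the last run of digits between single quotes
PATTERN = re.compile(
    r"^(.{26})"
    r"(HandleArticleWarningOnOrder ->(?=[ \t]{2,}-1\b)|TimingLog ->)"
    r".*('\d+')"
)

# Assumed replacement: groups 1, 2 and 3, with one space before group 3.
REPLACEMENT = r"\1\2 \3"


def shorten(line: str) -> str:
    """Return the shortened line, or the line unchanged if it does not match."""
    return PATTERN.sub(REPLACEMENT, line)


samples = [
    "2022-01-11 09:53:02.68 -> TimingLog -> 1: '23' -2: '54' -3: '54' "
    "-4: '54' -5: '65' -6: '65' -6.1: '65' -7: '76' -19: '76' -20: '76' "
    "-21: '4916' -22: '4982' -23: '5010' -24: '5015' -25: '5015'",
    "2022-01-11 09:54:09.16 -> HandleArticleWarningOnOrder ->  -1: '91' "
    "-2: '188' -2.1: '188' -3: '188' -4: '351' -5: '18276' -6: '18276'",
]

for sample in samples:
    print(shorten(sample))
```

Lines that do not match, such as a HandleArticleWarningOnOrder line that does not start with -1, are returned unchanged in this sketch.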

Or perhaps a simpler pattern that uses \K and a single capture group:

^.{26}(?:HandleArticleWarningOnOrder ->(?=\h{2,}-1\b)|TimingLog ->)\K.*('\d+')

In the replacement, use the single capture group.

You can use this regex to replace the string.

^(.*->)( *(-)?\d*(\.)?\d*: ('\d{1,}\'))*$

Then replace with $1$5, which keeps capture groups 1 and 5.
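This pattern uses only syntax that Python's `re` module also accepts, so its behavior can be checked with a short sketch. A repeated capture group keeps the text of its last iteration, which is why group 5 ends up holding the final quoted number. The helper name `shorten` is hypothetical, and `\1\5` is simply Python's spelling of `$1$5`.

```python
import re

# The pattern from this answer, unchanged.
# Group 1 is greedy up to the last "->"; the outer repeated group then walks
# over every "key: 'value'" pair, so group 5 keeps only the last value.
PATTERN = re.compile(r"^(.*->)( *(-)?\d*(\.)?\d*: ('\d{1,}\'))*$")


def shorten(line: str) -> str:
    """Keep group 1 and group 5; return the line unchanged if it does not match."""
    return PATTERN.sub(r"\1\5", line)


line = (
    "2022-01-11 09:54:09.16 -> HandleArticleWarningOnOrder ->  -1: '91' "
    "-2: '188' -2.1: '188' -3: '188' -4: '351' -5: '18276' -6: '18276'"
)
print(shorten(line))
```

With this replacement no space is emitted between `->` and the quoted number. Using `\1 \5` (or `$1 $5` in Notepad++) instead would reproduce the spacing shown in the preferred output.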

Here is a shorter version for the given examples...

Find: ^(\d.*->).*('\d+')
Replace All:
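To see what this Find pattern captures across several lines at once, here is a minimal Python sketch. The replacement `\1 \2` is an assumption chosen to reproduce the preferred output, and `shorten_all` is a hypothetical helper name. `re.MULTILINE` makes `^` match at the start of every line, which mimics a Replace All over the whole file.

```python
import re

# Group 1: everything from the leading digit up to the last "->" on the line.
# Group 2: the last run of digits between single quotes on the line.
PATTERN = re.compile(r"^(\d.*->).*('\d+')", re.MULTILINE)


def shorten_all(text: str) -> str:
    """Rewrite every matching line; assumed replacement is group 1, a space, group 2."""
    return PATTERN.sub(r"\1 \2", text)


log = (
    "2022-01-11 09:54:12.80 -> TimingLog -> 1: '13' -2: '43' -3: '43' "
    "-4: '43' -5: '51' -6: '51' -6.1: '51' -7: '57' -8: '57' -12: '86' "
    "-13: '86' -14: '8024' -15: '8263' -16: '8430' -17: '8524' -18: '8524'\n"
    "2022-01-11 09:54:34.02 -> HandleArticleWarningOnOrder ->  -1: '102' "
    "-2: '154' -2.1: '154' -3: '154' -4: '202' -5: '20106' -6: '20106'\n"
)
print(shorten_all(log), end="")
```

Unlike the first answer's pattern, this one does not check the log name, so it rewrites any line that starts with a digit and contains `->` followed by a quoted number. It therefore fits best after the lines of interest have already been filtered out of the larger list.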