Regex timing out
I'm trying to match strings like these:
foo: anything after the colon can be matched with (.*)+
foo.bar1.BAZ: balh5317{}({}(
This is the regex I'm using:
/^((?:(?:(?:[A-Za-z_]+)(?:[0-9]+)?)+[\.]?)+)(?:\s)?(?:\:)(?:\s)?((?:.*)+)$/
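Please excuse the non-capturing groups and extra parentheses; this is compiled from a builder class.
This works on the examples. The problem occurs when I try to input a string like this: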
foo.bar.baz.beef.stew.ect.and.forward
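I need to be able to check strings like this, but the regex engine times out or runs indefinitely (as far as I can tell) after a certain number of foo. segments.
I'm sure this is a logic problem I could solve, but unfortunately I'm far from mastering regex, and I'm hoping a more experienced user can shed some light on how to make this more efficient.
Also, here is a more detailed description of what I need to match: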
Property Name: can contain letters (A-Z, a-z), digits, and underscores, but can't start with a digit
<Property Name>.<Property Name>.<Prop...:<Anything after the colon>
Thanks for your time!
Starting with your regex:
^((?:(?:(?:[A-Za-z_]+)(?:[0-9]+)?)+[\.]?)+)(?:\s)?(?:\:)(?:\s)?((?:.*)+)$
^                      # Anchors to the beginning of the string.
(                      # Opens CG1
  (?:                  # Opens NCG
    (?:                # Opens NCG
      (?:              # Opens NCG
        [A-Za-z_]+     # Character class (any of the characters within)
      )                # Closes NCG
      (?:              # Opens NCG
        [0-9]+         # Character class (any of the characters within)
      )?               # Closes NCG
    )+                 # Closes NCG
    [\.]?              # Character class (any of the characters within)
  )+                   # Closes NCG
)                      # Closes CG1
(?:                    # Opens NCG
  \s                   # Token: \s (white space)
)?                     # Closes NCG
(?:                    # Opens NCG
  \:                   # Literal :
)                      # Closes NCG
(?:                    # Opens NCG
  \s                   # Token: \s (white space)
)?                     # Closes NCG
(                      # Opens CG2
  (?:                  # Opens NCG
    .*                 # . denotes any single character, except for newline
  )+                   # Closes NCG
)                      # Closes CG2
$                      # Anchors to the end of the string.
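I converted [0-9] to \d, just to make it easier to read (for ASCII input, both match the same thing). I also removed a lot of the non-capturing groups, because they weren't really being used.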
^((?:(?:[A-Za-z_]+\d*)+\.?)+)\s?\:\s?((?:.*)+)$
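Whether \d and [0-9] are truly interchangeable depends on the engine. As a small sketch, assuming Python's re module (the question does not say which engine is in use), \d on a str pattern also accepts non-ASCII decimal digits unless re.ASCII is set. The sample character and variable names are chosen only for illustration.

```python
import re

ARABIC_INDIC_THREE = "\u0663"  # the digit 3 in Arabic-Indic script

# [0-9] only ever matches the ten ASCII digits.
ascii_class = re.fullmatch(r"[0-9]", ARABIC_INDIC_THREE)

# \d on a str pattern matches any Unicode decimal digit by default...
unicode_digit = re.fullmatch(r"\d", ARABIC_INDIC_THREE)

# ...unless re.ASCII restricts it to [0-9].
ascii_digit = re.fullmatch(r"\d", ARABIC_INDIC_THREE, re.ASCII)

print(ascii_class is None, unicode_digit is not None, ascii_digit is None)
# prints: True True True
```

If property names must stay ASCII-only, keeping [0-9] or passing the engine's ASCII flag avoids the difference.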
I also merged the \s and .* into [\s\S]*, but seeing that it was followed by a + sign, I removed the group and made it [\s\S]+.
^((?:(?:[A-Za-z_]+\d*)+\.?)+)\s?\:([\s\S]+)$
                      ^
Now, I'm not sure what the + above the caret is supposed to do. We can remove it, and with it the non-capturing group around it.
^((?:[A-Za-z_]+\d*\.?)+)\s?\:([\s\S]+)$
Explanation:
^                 # Anchors to the beginning of the string.
(                 # Opens CG1
  (?:             # Opens NCG
    [A-Za-z_]+    # Character class (any of the characters within)
    \d*           # Token: \d (digit)
    \.?           # Literal .
  )+              # Closes NCG
)                 # Closes CG1
\s?               # Token: \s (white space)
\:                # Literal :
(                 # Opens CG2
  [\s\S]+         # Character class (any of the characters within)
)                 # Closes CG2
$                 # Anchors to the end of the string.
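Now, if you're going to be working with multiple lines, you might want to change [\s\S]+ back to .* so that a value cannot run past the end of its line. There are a few different options, but it depends on the language you're using.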
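To make the multi-line point concrete, here is a minimal sketch assuming Python's re module (the question does not name its engine, and other engines spell these flags differently). It runs the simplified regex from above against a two-line input, once with [\s\S]+ and once with .* plus re.MULTILINE. TEXT and the variable names are made up for the example.

```python
import re

TEXT = "foo: one\nbar.baz: two"  # hypothetical two-line input

# [\s\S]+ crosses newlines, so the first value swallows the second line.
single = re.compile(r"^((?:[A-Za-z_]+\d*\.?)+)\s?\:([\s\S]+)$")

# .* stops at a newline (re.DOTALL is not set), and re.MULTILINE lets
# ^ and $ match at the start and end of every line.
per_line = re.compile(r"^((?:[A-Za-z_]+\d*\.?)+)\s?\:(.*)$", re.MULTILINE)

whole = single.match(TEXT)
print(whole.groups())
# ('foo', ' one\nbar.baz: two')

print(per_line.findall(TEXT))
# [('foo', ' one'), ('bar.baz', ' two')]
```

Note that the space after the colon ends up in the value, because the simplified regex folded the second \s? into the value group.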
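Honestly, I did this in steps, but the biggest problem was (?:.*)+
This tells the engine to match 0 or more characters 1 or more times, so the same text can be matched in a huge number of ways. When the rest of the pattern fails, the engine may try all of them, which is
catastrophic backtracking (as xufox linked to in the comments). Nested quantifiers in the name part, such as (?:[A-Za-z_]+\d*)+, have the same weakness when no colon follows the names, as in your long example.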
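As a rough model of why this blows up (not an exact count of any engine's steps), consider a quantified group whose body can itself match one or more characters. A run of n characters can then be cut into pieces in 2^(n-1) different ways, and a backtracking engine may visit every one of them before giving up. The helper count_splits below is a hypothetical name written just to count those ways.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n: int) -> int:
    """Number of ways to cut a run of n characters into non-empty pieces."""
    if n == 0:
        return 1  # nothing left to cut: exactly one way (stop here)
    # The first piece takes k characters; the remainder is cut recursively.
    return sum(count_splits(n - k) for k in range(1, n + 1))


for run in ("foo", "forward", "a" * 30):
    print(len(run), count_splits(len(run)))
# 3 4
# 7 64
# 30 536870912
```

Each extra character doubles the count, which is why a pattern that is instant on short samples can appear to hang on slightly longer input.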
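The resulting regex, as well as your original regex, allows names that end with a . (for example foo.: value).
The regex below requires a name after every ., so it rejects a trailing dot. Because each repetition must start with a literal ., there is only one way to split the names, which removes the ambiguity that drives the backtracking.
It still matches names with digits in the middle, like foo.ba5r, if that is acceptable.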
^([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)\s?\:([\s\S]+)$
Explanation:
^                 # Anchors to the beginning of the string.
(                 # Opens CG1
  [A-Za-z_]       # Character class (any of the characters within)
  \w*             # Token: \w (a-z, A-Z, 0-9, _)
  (?:             # Opens NCG
    \.            # Literal .
    [A-Za-z_]     # Character class (any of the characters within)
    \w*           # Token: \w (a-z, A-Z, 0-9, _)
  )*              # Closes NCG
)                 # Closes CG1
\s?               # Token: \s (white space)
\:                # Literal :
(                 # Opens CG2
  [\s\S]+         # Character class (any of the characters within)
)                 # Closes CG2
$                 # Anchors to the end of the string.
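A quick way to try the final regex is sketched below, assuming Python's re module (the question does not say which engine it uses). PROPERTY_LINE and parse_property are illustrative names, not part of the original builder class. re.ASCII is passed so that \w means exactly a-z, A-Z, 0-9 and _, as in the explanation above.

```python
import re

# Final pattern, written as in the explanation; re.ASCII keeps \w to [A-Za-z0-9_].
PROPERTY_LINE = re.compile(
    r"^([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)\s?\:([\s\S]+)$",
    re.ASCII,
)


def parse_property(line: str):
    """Return (dotted_name, value), or None if the line does not match."""
    m = PROPERTY_LINE.match(line)
    return (m.group(1), m.group(2)) if m else None


print(parse_property("foo.bar1.BAZ: balh5317{}({}("))
# ('foo.bar1.BAZ', ' balh5317{}({}(')

print(parse_property("foo.: trailing dot"))
# None

# The input from the question that used to time out now fails fast (no colon).
print(parse_property("foo.bar.baz.beef.stew.ect.and.forward"))
# None
```

The value keeps its leading space because the pattern has no \s? after the colon. Stripping the captured value, or adding \s? back after \:, are both reasonable ways to drop it.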