Remove nested bbcode style tags and anything inside them
I need help with a regular expression to remove some content. I can't get it to work the way I want.
Say I have this text:
[quote=test]
[quote=test]for sure[/quote]
Test
[/quote]
[this should not be removed]
Dont remove me
How can I remove everything above [this should not be removed]? Note that Test can be anything.
So I want to remove anything inside:
[quote=*][/quote]
This is how far I've got:
preg_replace('#\[quote=(.+)](.+)\[/quote]#Usi', '', $message);
But it leaves: Test [/quote]
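Matching nested bbcode-style tags is fairly complex and usually calls for a string parser that isn't regex-based.
You appear to be using PHP, which does support regex recursion through the (?R) syntax; with it we can handle nested bbcode like this.
Note that unmatched pairs of opening [quote=*] and closing [/quote] tags will not be matched.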
Regex
\[(quote)=[^]]+\](?>(?R)|.)*?\[/quote]
https://regex101.com/r/xF3oR6/1
Code
$result = preg_replace('%\[(quote)=[^]]+\](?>(?R)|.)*?\[/quote]%si', '', $subject);
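To see the difference in practice, here is a minimal sketch that runs both the original attempt from the question and the recursive pattern on the sample text. The variable names ($flat, $nested, $unbalanced) are only illustrative, and the sample is assumed to be joined with plain "\n" line breaks.

```php
<?php
// Sample text from the question (assumed to use "\n" line breaks).
$subject = "[quote=test]\n[quote=test]for sure[/quote]\nTest\n[/quote]\n"
         . "[this should not be removed]\nDont remove me";

// Original attempt: lazy matching stops at the FIRST [/quote] it finds.
$flat = preg_replace('#\[quote=(.+)](.+)\[/quote]#Usi', '', $subject);
// $flat is "\nTest\n[/quote]\n[this should not be removed]\nDont remove me"

// Recursive pattern: (?R) consumes each inner quote block as a unit.
$pattern = '%\[(quote)=[^]]+\](?>(?R)|.)*?\[/quote]%si';
$nested  = preg_replace($pattern, '', $subject);
// $nested is "\n[this should not be removed]\nDont remove me"

// Unmatched opening tag: only the balanced inner pair is removed.
$unbalanced = preg_replace($pattern, '', '[quote=a][quote=b]x[/quote]');
// $unbalanced is "[quote=a]"

var_dump($flat, $nested, $unbalanced);
```

The leading line break is left behind in both cases; wrapping the result in trim() is one way to drop it if that matters.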
Human-readable
# \[(quote)=[^]]+\](?>(?R)|.)*?\[/quote]
#
# Options: Case insensitive; Exact spacing; Dot matches line breaks; ^$ don’t match at line breaks; Greedy quantifiers; Regex syntax only
#
# Match the character “[” literally «\[»
# Match the regex below and capture its match into backreference number 1 «(quote)»
#    Match the character string “quote” literally (case insensitive) «quote»
# Match the character “=” literally «=»
# Match any character that is NOT a “]” «[^]]+»
#    Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
# Match the character “]” literally «\]»
# Match the regular expression below; do not try further permutations of this group if the overall regex fails (atomic group) «(?>(?R)|.)*?»
#    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
#    Match this alternative (attempting the next alternative only if this one fails) «(?R)»
#       Match the entire regular expression (recursion; restore capturing groups upon exit; do not try further permutations of the recursion if the overall regex fails) «(?R)»
#    Or match this alternative (the entire group fails if this one fails to match) «.»
#       Match any single character «.»
# Match the character “[” literally «\[»
# Match the character string “/quote]” literally (case insensitive) «/quote]»
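For comparison, the non-regex-style approach mentioned at the top of this answer can be sketched with a simple depth counter. This is only a rough illustration, not a full bbcode parser: stripQuoteBlocks is a hypothetical helper name, and a flat (non-recursive) pattern is still used just to locate the tags.

```php
<?php
// Hypothetical helper (not part of the original answer): strips balanced
// [quote=...]...[/quote] blocks, nested ones included, by counting depth.
function stripQuoteBlocks(string $text): string
{
    // Flat pattern used only to find opening and closing tags with offsets.
    preg_match_all('~\[quote=[^]]+\]|\[/quote\]~i', $text, $m, PREG_OFFSET_CAPTURE);

    $out      = '';
    $depth    = 0;
    $copyFrom = 0; // offset of the first byte not yet copied or dropped

    foreach ($m[0] as [$tag, $offset]) {
        $isClose = ($tag[1] === '/');
        if (!$isClose) {
            if ($depth === 0) {
                // Outermost opening tag: keep the text that precedes it.
                $out .= substr($text, $copyFrom, $offset - $copyFrom);
                $copyFrom = $offset;
            }
            $depth++;
        } elseif ($depth > 0) {
            $depth--;
            if ($depth === 0) {
                // Outermost block closed: drop everything up to this point.
                $copyFrom = $offset + strlen($tag);
            }
        }
        // A closing tag seen at depth 0 is stray and stays in the text.
    }

    // Copy the tail; a block that never closes is kept unchanged.
    return $out . substr($text, $copyFrom);
}

$sample = "[quote=test]\n[quote=test]for sure[/quote]\nTest\n[/quote]\n"
        . "[this should not be removed]\nDont remove me";
var_dump(stripQuoteBlocks($sample));
```

One design difference worth noting: this sketch leaves an unclosed block entirely untouched, whereas the recursive pattern would still strip any balanced pair nested inside it.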