Remove nested bbcode style tags and anything inside them

I need some help with a regex to remove something. I can't get it to work the way I want.

Let's say I have this text:

[quote=test]
[quote=test]for sure[/quote]
Test
[/quote]

[this should not be removed]
Dont remove me

How can I remove everything above [this should not be removed]? Note that Test can be anything.

So I want to remove the tags and anything inside:

[quote=*][/quote]

This is how far I've come:

preg_replace('#\[quote=(.+)](.+)\[/quote]#Usi', '', $message);

But it leaves behind: Test [/quote]

Matching nested bbcode-style tags is fairly complex and usually involves a non-regex string parser.
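For comparison, here is a minimal sketch of what such a non-regex route could look like. It is only an illustration under assumptions: the function name stripQuoteBlocks is hypothetical, a regex is still used to locate the individual tags, and the pairing itself is done with a stack of opener offsets.

```php
<?php
// Hypothetical helper (not part of any library): removes every balanced
// [quote=...]...[/quote] block, nested ones included, and leaves
// unpaired tags where they are.
function stripQuoteBlocks(string $text): string
{
    // Locate every opening and closing tag together with its byte offset.
    preg_match_all('%\[quote=[^]]+\]|\[/quote\]%i', $text, $m, PREG_OFFSET_CAPTURE);

    $stack  = [];  // offsets of openers still waiting for a closer
    $ranges = [];  // [start, end) spans of outermost balanced blocks

    foreach ($m[0] as [$tag, $offset]) {
        if ($tag[1] !== '/') {
            // Opening tag: remember where it starts.
            $stack[] = $offset;
        } elseif ($stack) {
            // Closing tag: pair it with the most recent opener.
            $start = array_pop($stack);
            $end   = $offset + strlen($tag);
            // Spans recorded earlier that start after this opener are nested
            // inside the new span, so they are dropped.
            while ($ranges && end($ranges)[0] >= $start) {
                array_pop($ranges);
            }
            $ranges[] = [$start, $end];
        }
        // A closing tag with an empty stack is unpaired and is ignored.
    }

    // Copy everything that lies outside the recorded spans.
    $out = '';
    $pos = 0;
    foreach ($ranges as [$start, $end]) {
        $out .= substr($text, $pos, $start - $pos);
        $pos  = $end;
    }
    return $out . substr($text, $pos);
}

$sample = "[quote=test]\n[quote=test]for sure[/quote]\nTest\n[/quote]\n\n"
        . "[this should not be removed]\nDont remove me";

echo ltrim(stripQuoteBlocks($sample)), "\n";
```

On the sample text this leaves only the two lines after the quote block. The blank lines that followed the block are not part of the tags, which is why the output is passed through ltrim().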

It looks like you are using PHP, which supports the (?R) syntax for regex "recursion". With it we can handle nested bbcode like this.

Note that unmatched opening [quote=*] and closing [/quote] tags will not be matched.

Regex

\[(quote)=[^]]+\](?>(?R)|.)*?\[/quote]

https://regex101.com/r/xF3oR6/1

Code

$result = preg_replace('%\[(quote)=[^]]+\](?>(?R)|.)*?\[/quote]%si', '', $subject);
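To see the call above in context, here is a small runnable sketch that applies it to the sample text from the question. The heredoc, the error check, and the variable names $pattern and $unbalanced are additions for illustration. The null check relies on preg_replace() returning null on failure, which can happen with recursive patterns on very large inputs.

```php
<?php
// Sample input from the question.
$subject = <<<TEXT
[quote=test]
[quote=test]for sure[/quote]
Test
[/quote]

[this should not be removed]
Dont remove me
TEXT;

$pattern = '%\[(quote)=[^]]+\](?>(?R)|.)*?\[/quote]%si';

// preg_replace() returns null when PCRE reports an error
// (for example a backtrack or recursion limit being hit).
$result = preg_replace($pattern, '', $subject);
if ($result === null) {
    throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
}

// The newlines that followed the removed block are still present,
// so strip the leading whitespace before printing.
echo ltrim($result), "\n";

// An opener without a closer is left untouched.
$unbalanced = preg_replace($pattern, '', '[quote=a]never closed');
echo $unbalanced, "\n";
```

The first echo prints the two lines that follow the quote block, and the second prints its input unchanged, which matches the note about unmatched tags.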

Human-readable

# \[(quote)=[^]]+\](?>(?R)|.)*?\[/quote]
# 
# Options: Case insensitive; Exact spacing; Dot matches line breaks; ^$ don’t match at line breaks; Greedy quantifiers; Regex syntax only
# 
# Match the character “[” literally «\[»
# Match the regex below and capture its match into backreference number 1 «(quote)»
#    Match the character string “quote” literally (case insensitive) «quote»
# Match the character “=” literally «=»
# Match any character that is NOT a “]” «[^]]+»
#    Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
# Match the character “]” literally «\]»
# Match the regular expression below; do not try further permutations of this group if the overall regex fails (atomic group) «(?>(?R)|.)*?»
#    Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
#    Match this alternative (attempting the next alternative only if this one fails) «(?R)»
#       Match the entire regular expression (recursion; restore capturing groups upon exit; do not try further permutations of the recursion if the overall regex fails) «(?R)»
#    Or match this alternative (the entire group fails if this one fails to match) «.»
#       Match any single character «.»
# Match the character “[” literally «\[»
# Match the character string “/quote]” literally (case insensitive) «/quote]»
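The same breakdown can also be kept next to the code. The sketch below is one possible way to do that, assuming PCRE's x (extended) flag, which ignores whitespace in the pattern and treats # as the start of a comment. The variable names $commented and $compact are made up for this example.

```php
<?php
// Same pattern as above, spread over several lines with the x flag.
// The comments must not contain the delimiter (%) or a single quote.
$commented = '%
    \[(quote)=[^]]+\]    # opening tag with an attribute, e.g. [quote=test]
    (?>                  # atomic group: a matched item is not re-tried
        (?R)             #   either a complete nested quote block (recursion)
      | .                #   or any single character (s flag: newlines too)
    )*?                  # repeated lazily, as few times as possible
    \[/quote]            # closing tag
%six';

// Compact form, for comparison.
$compact = '%\[(quote)=[^]]+\](?>(?R)|.)*?\[/quote]%si';

$subject = "[quote=a]\n[QUOTE=b]x[/quote]\ny[/quote] kept";

// Both patterns are expected to remove the same text.
var_dump(preg_replace($commented, '', $subject) === preg_replace($compact, '', $subject));
```

Whether the longer form is worth it is a matter of taste; it mainly helps when the pattern has to be maintained later.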