How to remove all bbcode quote blocks by a specific user from a text?
I want to remove quotes made with BBCode in PHP, like this example:
[quote=testuser]
[quote=anotheruser]a sdasdsa dfv rdfgrgre gzdf vrdg[/quote]
sdfsd fdsf dsf sdf[/quote]
the rest of the post text
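I'm working on a blocking system so that users don't have to see content they don't want to see. Say "testuser" is blocked: the reader doesn't want any of that quote section, including the second quote nested inside it, since it is part of the main quote.
So the post would be left with only: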
the rest of the post text
I'd like to know the best way to do this. I was hoping for a regex, but it is more complicated than I thought. This is my attempt:
/\[quote\=testuser\](.*)\[\/quote\]/is
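However, this captures everything up to the last closing quote tag in the post.
Is there a quick alternative, or a good fix for my regex?
To sum up: remove the blocked user's initial quote and everything inside that quote, but nothing else.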
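To make the problem concrete, here is a small sketch of how the attempt behaves. The sample posts are made up for illustration. It shows that the greedy form overshoots when a later, unrelated quote exists, and that simply switching to a lazy quantifier stops too early once quotes nest.

```php
<?php
// Hypothetical sample: a blocked quote followed by a quote that should stay.
$post = "[quote=testuser]blocked[/quote]\n"
      . "keep this line\n"
      . "[quote=anotheruser]keep this quote[/quote]\n"
      . "the rest of the post text";

// Greedy (.*) with the s flag runs on to the LAST [/quote] in the post.
$greedy = preg_replace('/\[quote\=testuser\](.*)\[\/quote\]/is', '', $post);

// Lazy (.*?) stops at the FIRST [/quote], which is too early for nested quotes.
$nested = "[quote=testuser][quote=anotheruser]inner[/quote]outer[/quote]\nthe rest";
$lazy = preg_replace('/\[quote\=testuser\](.*?)\[\/quote\]/is', '', $nested);

var_dump($greedy); // the wanted line and the wanted quote are gone as well
var_dump($lazy);   // "outer[/quote]" is left behind
```

Neither quantifier can tell which closing tag belongs to which opening tag, which is why the answers count tags instead.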
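What you want is not achievable with a single plain regular expression (one without recursion), because the quotes can nest. I suggest scanning the file until you find a [quote=testuser]. Once you find it, set a boolean to start filtering and set a counter to 1. While the boolean is true, increment the counter for every [quote=...] tag you encounter and decrement it for every [/quote] tag you encounter. When the counter reaches 0, set the filtering boolean back to false.
Here is some pseudocode. You will probably need to adapt it to your application, but I think it shows the general algorithm to use.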
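    filtering = false
    counter = 0
    for each line:
        if not filtering and line contains "[quote=testuser]"
            filtering = true           # start dropping lines
        if filtering
            if line contains "[quote="
                counter += 1           # the blocked quote itself or a nested quote
            if line contains "[/quote]"
                counter -= 1
                if counter == 0
                    filtering = false  # the line with the final [/quote] is dropped too
        else
            print line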
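A rough PHP rendering of this counter approach might look like the following. It is only a sketch: filter_blocked_lines() is a hypothetical name, and it assumes the post is processed line by line with at most one opening and one closing quote tag per line.

```php
<?php
// Line-based sketch of the counter idea (hypothetical helper, not a library call).
function filter_blocked_lines(string $text, string $user): string
{
    $filtering = false;
    $counter = 0;
    $kept = [];
    foreach (explode("\n", $text) as $line) {
        if (!$filtering && strpos($line, "[quote={$user}]") !== false) {
            $filtering = true;              // start dropping lines
        }
        if ($filtering) {
            if (strpos($line, '[quote=') !== false) {
                $counter++;                 // the blocked quote itself or a nested one
            }
            if (strpos($line, '[/quote]') !== false) {
                $counter--;
                if ($counter === 0) {
                    $filtering = false;     // the closing line is dropped as well
                }
            }
            continue;                       // never keep a line seen while filtering
        }
        $kept[] = $line;
    }
    return implode("\n", $kept);
}

$post = "[quote=testuser]\n"
      . "[quote=anotheruser]a sdasdsa dfv rdfgrgre gzdf vrdg[/quote]\n"
      . "sdfsd fdsf dsf sdf[/quote]\n"
      . "the rest of the post text";

echo filter_blocked_lines($post, 'testuser'), "\n";
```

Because it works on whole lines, any text that shares a line with the blocked quote's opening or closing tag is dropped too; splitting on the tags themselves avoids that.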
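As far as I can tell, this is not a simple process. Here are my steps...
- Use preg_split() to divide the input string 3 ways: opening quote tags, closing quote tags, and everything else. I split on the opening and closing tags, but use PREG_SPLIT_DELIM_CAPTURE to keep them in the output array in their original position/order. PREG_SPLIT_NO_EMPTY is used so that there are no useless iterations in the foreach loop.
- Loop over the generated array and search for the username to omit.
- When a quote by the targeted user is found, store the starting index of that element and set $opens to 1.
- Whenever a new opening quote tag is found, $opens is incremented.
- Whenever a new closing quote tag is found, $opens is decremented.
- As soon as $opens reaches 0, $start and the end index are fed to range() to generate an array of every index between the two points. array_flip(), of course, moves the values to keys. array_diff_key() then removes that range of indexes from the array generated by preg_split().
- If all goes well, implode() glues the substrings back together, retaining only the desired components.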
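To make the first and last steps more tangible, here is a small sketch of what the split produces and how a range of indexes is removed. The sample string and the hard-coded $start and $end values are made up for illustration; in the real function they come from the loop.

```php
<?php
// Hypothetical one-line sample with a blocked quote that contains a nested quote.
$bbcode = 'before[quote=testuser]blocked[quote=anotheruser]nested[/quote][/quote]after';

// Split on quote tags, keeping the tags themselves as array elements
// and discarding the empty string between the two adjacent closing tags.
$substrings = preg_split(
    '~(\[/?quote[^]]*\])~',
    $bbcode,
    0,
    PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
);
print_r($substrings);
// 0 => before, 1 => [quote=testuser], 2 => blocked, 3 => [quote=anotheruser],
// 4 => nested, 5 => [/quote], 6 => [/quote], 7 => after

// Suppose the loop found that the blocked quote spans indexes 1 through 6.
$start = 1;
$end = 6;

// range() lists the indexes, array_flip() turns them into keys,
// and array_diff_key() drops those keys from the split array.
$remaining = array_diff_key($substrings, array_flip(range($start, $end)));

echo implode($remaining), "\n"; // beforeafter
```

Note that array_diff_key() preserves the original keys, so the surviving elements keep indexes 0 and 7; implode() does not care about the gap.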
Function declaration:
/*
    This function DOES NOT validate the $bbcode string to contain a balanced number of opening & closing tags.
    This function DOES check that there are enough closing tags to conclude a targeted opening tag.
*/
function omit_user_quotes($bbcode, $user) {
    $substrings = preg_split('~(\[/?quote[^]]*\])~', $bbcode, 0, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
    $opens = 0;  // necessary declaration to avoid Notice when no quote tags in $bbcode string
    foreach ($substrings as $index => $substring) {
        if (!isset($start) && $substring === "[quote={$user}]") {  // found targeted user's first opening quote
            $start = $index;  // disqualify the first if statement and start searching for end tag
            $opens = 1;  // $opens counts how many end tags are required to conclude quote block
        } elseif (isset($start)) {
            if (strpos($substring, '[quote=') !== false) {  // if a nested opening quote tag is found
                ++$opens;  // increment and continue looking for closing quote tags
            } elseif (strpos($substring, '[/quote]') !== false) {  // if a closing quote tag is found
                --$opens;  // decrement and check for quote tag conclusion or error
                if (!$opens) {  // if $opens is zero ($opens can never be less than zero)
                    $substrings = array_diff_key($substrings, array_flip(range($start, $index)));  // slice away unwanted elements from input array
                    unset($start);  // re-qualify the first if statement to allow the process to repeat
                }
            }
        }
    }
    if ($opens) {  // if $opens is positive
        return 'Error due to opening/closing tag imbalance (too few end tags)';
    } else {
        return trim(implode($substrings));  // trims the whitespaces on either side of $bbcode string as feature
    }
}
Test inputs:
/* Single unwanted quote with nested innocent quote: */
/*$bbcode='[quote=testuser]
[quote=anotheruser]a sdasdsa dfv rdfgrgre gzdf vrdg[/quote]
sdfsd fdsf dsf sdf[/quote]
the rest of the test'; */
/* output: the rest of the test */
/* Complex battery of unwanted, wanted, and nested quotes: */
$bbcode = '[quote=mickmackusa]Keep this[/quote]
[quote=testuser]Don\'t keep this because
[quote=mickmackusa]said don\'t do it[/quote]
... like that\'s a good reason
[quote=NaughtySquid] It\'s tricky business, no?[/quote]
[quote=nester][quote=nesty][quote=nested][/quote][/quote][/quote]
[/quote]
Let\'s remove a second set of quotes
[quote=testuser]Another quote block[/quote]
[quote=mickmackusa]Let\'s do a third quote inside of my quote...
[quote=testuser]Another quote block[/quote]
[/quote]
This should be good, but
What if [quote=testuser]quotes himself [quote=testuser] inside of his own[/quote] quote[/quote]?';
/* No quotes: */
//$bbcode='This has no bbcode quote tags in it.';
/* output: This has no bbcode quote tags in it. */
/* Too few end quote tags by innocent user:
(No flag is raised because the targeted user has not quoted any text) */
//$bbcode='This [quote=mickmackusa] has not end tag.';
/* output: This [quote=mickmackusa] has not end tag. */
/* Too few end quote tags by unwanted user: */
//$bbcode='This [quote=testuser] has not end tag.';
/* output: Error due to opening/closing tag imbalance (too few end tags) */
/* Too many end quote tags by unwanted user:
(No flag is raised because the function does not validate the bbcode text as fully balanced) */
//$bbcode='This [quote=testuser] has too many end[/quote] tags.[/quote]';
/* output: This tags.[/quote] */
Function call:
$user = 'testuser';
echo omit_user_quotes($bbcode, $user); // omit a single user's quote blocks
/* Or if you want to omit quote blocks from multiple users, you can use a loop:
$users = ['mickmackusa', 'NaughtySquid'];
foreach ($users as $user) {
$bbcode = omit_user_quotes($bbcode, $user);
}
echo $bbcode;
*/