Why are the HTML tags not removed even after using the strip_tags function in the following scenario?

I have an array named $aMessages. It is actually quite large, so for reference I am printing only its first three elements below:

    [0] => Array
            [message_id] => 240
            [thread_id] => 43
            [user_id] => 244
            [text] => test msg<div class="mail_attach_image"><a class="group1" href="" ><img src=""  /></a><br><a class="mail_attach_image_link_dwl"  href=" 2015/month_04/file_49c79e88b24a8fff8104909fce19aa3f.png" >Download</a></div>
            [time_stamp] => 1429695832
            [total_attachment] => 0
            [is_mobile] => 0
            [has_forward] => 0
            [profile_page_id] => 0
            [user_server_id] => 0
            [user_name] => profile-244
            [full_name] => CampusKnot .
            [gender] => 1
            [user_image] => 2015/03/ae6f1665efc29eb3360d392bbcd183b7%s.jpg
            [is_invisible] => 0
            [user_group_id] => 7
            [language_id] =>
            [forwards] => Array


    [1] => Array
            [message_id] => 241
            [thread_id] => 43
            [user_id] => 901
            [text] => hi
            [time_stamp] => 1429695875
            [total_attachment] => 0
            [is_mobile] => 0
            [has_forward] => 0
            [profile_page_id] => 0
            [user_server_id] => 1
            [user_name] => profile-901
            [full_name] => Student Campusknot
            [gender] => 2
            [user_image] => 2014/11/b23e023750785c8b5e61ace4d6a202fa%s.png
            [is_invisible] => 0
            [user_group_id] => 6
            [language_id] =>
            [forwards] => Array


    [2] => Array
            [message_id] => 243
            [thread_id] => 43
            [user_id] => 244
            [text] => textmessage
            [time_stamp] => 1429710052
            [total_attachment] => 0
            [is_mobile] => 0
            [has_forward] => 0
            [profile_page_id] => 0
            [user_server_id] => 0
            [user_name] => profile-244
            [full_name] => CampusKnot .
            [gender] => 1
            [user_image] => 2015/03/ae6f1665efc29eb3360d392bbcd183b7%s.jpg
            [is_invisible] => 0
            [user_group_id] => 7
            [language_id] =>
            [forwards] => Array


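If you look closely at the ['text'] key of the first element, it contains some HTML code. I want to remove this HTML and keep only the text value (in this case only "test msg" should remain; all the HTML should be removed).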

So basically, I want to check the ['text'] value of every element for HTML code.

If HTML is present, it should be removed so that only the plain text remains.


foreach($aMessages as $key => $value) {
  $value['text'] = strip_tags($value['text']);
}




foreach($aMessages as $key => $value) {
    // Write back to the original array; $value is only a copy of the element
    $aMessages[$key]['text'] = strip_tags($value['text']);
}

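foreach iterates over the array by value, so $value is a copy of each element and changing it does not change the original array. Either write the new value into the original array or assign $value by reference.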

Quoting http://php.net/manual/en/control-structures.foreach.php:

In order to be able to directly modify array elements within the loop precede $value with &. In that case the value will be assigned by reference.
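As a minimal sketch of the by-reference variant described in that quote: the two-element $aMessages array below is illustrative sample data reduced from the question, not the real array.

```php
<?php
// Illustrative sample data (an assumption, reduced from the question's array).
$aMessages = [
    ['message_id' => 240, 'text' => 'test msg<br>'],
    ['message_id' => 241, 'text' => 'hi'],
];

// The & makes $value an alias of the current element, so the assignment
// changes $aMessages itself instead of a temporary copy.
foreach ($aMessages as &$value) {
    $value['text'] = strip_tags($value['text']);
}
unset($value); // break the reference so later code cannot overwrite the last element by accident

print_r(array_column($aMessages, 'text'));
```

The unset() after the loop matters: without it, $value keeps pointing at the last element, and reusing the variable later would silently change the array.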

See also: How does PHP 'foreach' actually work?


My issue is the string between HTML anchor tags is not getting ignored.

Yes, strip_tags does what its name says. It strips the tags, but not their content.
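A short illustration of that behavior, using a shortened version of the markup from the question (the href value is a placeholder):

```php
<?php
// Shortened, illustrative version of the first message's text.
$text = 'test msg<div class="mail_attach_image"><a class="mail_attach_image_link_dwl" href="#">Download</a></div>';

// Only the tags are removed; the link text "Download" stays in the result.
$stripped = strip_tags($text);
echo $stripped; // test msgDownload
```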

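If you want to remove the elements together with their content, you can either parse the markup or simply cut off everything after the first <. The first approach requires more code. The latter is less reliable because the text might contain a less-than sign that is not part of a tag.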

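A good tradeoff between reliability and amount of code is to compare the original string against the stripped string and then return only the substring from the start up to the first differing character, e.g.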

$text = substr($string, 0, strspn($string ^ strip_tags($string), "\0"));
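To see why the comparison works: XOR-ing two strings byte by byte produces a NUL byte wherever they agree, so counting the leading NUL bytes gives the length of the common prefix. The sketch below wraps this in a helper; the name text_before_first_tag is hypothetical and the sample strings are illustrative.

```php
<?php
// Hypothetical helper name; returns the part of $string that precedes
// the first position where strip_tags() changed something.
function text_before_first_tag(string $string): string
{
    // Identical bytes XOR to "\0"; the result is as long as the shorter string.
    $diff = $string ^ strip_tags($string);

    // strspn() counts how many leading bytes of $diff are "\0",
    // i.e. the length of the prefix both strings share.
    $prefixLength = strspn($diff, "\0");

    return substr($string, 0, $prefixLength);
}

echo text_before_first_tag('test msg<div><a href="#">Download</a></div>'), "\n"; // test msg
echo text_before_first_tag('hi'), "\n"; // hi
```

A string without any tags is returned unchanged, because the stripped copy is identical and the whole XOR result consists of NUL bytes.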

Note that none of these approaches takes into account that there may be text outside the tags, e.g. textMsg<b>foo</b>bar<i>baz</i>end will only yield "textMsg". If you want "textMsg bar end", use DOM like this:

$string = 'textMsg<b>foo</b>bar<i>baz</i>end';

$dom = new DOMDocument;
$dom->loadHTML('<div id="root">' . $string . '</div>');
$xpath = new DOMXPath($dom);
$combinedDirectTextNodes = [];
foreach ($xpath->evaluate('id("root")/text()') as $text) {
    $combinedDirectTextNodes[] = $text->nodeValue;
}

echo implode(' ', $combinedDirectTextNodes); // textMsg bar end
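One possible way to apply the DOM approach to every message is sketched below. The helper name direct_text and the two-element sample array are assumptions for illustration; the libxml_use_internal_errors() calls are only there to keep parser warnings about sloppy markup out of the output.

```php
<?php
// Hypothetical helper: returns only the text nodes that are direct
// children of the wrapper element, joined with single spaces.
function direct_text(string $html): string
{
    $previous = libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    $dom = new DOMDocument;
    $dom->loadHTML('<div id="root">' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $parts = [];
    foreach ($xpath->evaluate('id("root")/text()') as $node) {
        $parts[] = $node->nodeValue;
    }

    return implode(' ', $parts);
}

// Illustrative sample data (an assumption, reduced from the question's array).
$aMessages = [
    ['message_id' => 240, 'text' => 'test msg<div class="mail_attach_image"><a href="">Download</a></div>'],
    ['message_id' => 241, 'text' => 'hi'],
];

foreach ($aMessages as $key => $value) {
    $aMessages[$key]['text'] = direct_text($value['text']);
}

print_r(array_column($aMessages, 'text'));
```

Note that loadHTML() assumes a legacy encoding unless told otherwise, so messages containing non-ASCII text may need extra care.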

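If I'm not mistaken, this will work for you if all you care about is getting test msg. Otherwise use strip_tags, which removes all the tags but leaves the rest of the data in place.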

foreach($aMessages as $key => $value){
    $pos = strpos($value['text'], '<'); // false when the text contains no '<'
    $aMessages[$key]['text'] = $pos === false ? $value['text'] : substr($value['text'], 0, $pos);
}
print_r($aMessages); //Array ( [0] => Array ( [message_id] => 240 [thread_id] => 43 [user_id] => 244 [text] => test msg [time_stamp] => 1429695832 [total_attachment] => 0 [is_mobile] => 0 ) [1] => Array ( [message_id] => 241 [thread_id] => 43 [user_id] => 901 [text] => hi [time_stamp] => 1429695875 [total_attachment] => 0 [is_mobile] => 0 ) [2] => Array ( [message_id] => 243 [thread_id] => 43 [user_id] => 244 [text] => textmessage [time_stamp] => 1429710052 [total_attachment] => 0 [is_mobile] => 0 ) )
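  1. Inside the foreach loop you should modify the original array, not the copy.
  2. You cannot use strip_tags() to remove tags together with the content inside them. strip_tags() only removes the tags.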


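function strip_tags_content($text, $tags = '', $invert = FALSE) {

  // Collect the tag names listed in $tags, e.g. '<b><i>' becomes ['b', 'i']
  preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
  $tags = array_unique($tags[1]);

  if(is_array($tags) AND count($tags) > 0) {
    if($invert == FALSE) {
      // Remove every element, including its content, except the listed tags
      return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    else {
      // Remove only the listed elements, including their content
      return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
  }
  elseif($invert == FALSE) {
    // No tags given: remove every element together with its content
    return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
  }
  return $text;
}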

foreach($aMessages as $key => $value) {
  // Assign to the original array; assigning to $value would only change the copy
  $aMessages[$key]['text'] = strip_tags_content($value['text']);
}