Remove unwanted HTML code from text (an article)

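I have a Joomla website that was just migrated from static HTML. There are 1000 articles, and each one contains unwanted HTML code like the sample below. How can I remove this HTML from the articles without opening each one to edit it?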

<div id="mainDIV">
<div id="topDIV">
<div id="topnav">
<div>
<div id="topnavdiv0"> </div>
<div id="topnavdiv"><a href="../store/">SHOP NOW</a> <img title="" src="images/shop-basket.gif" />  |  1-800-336-1630</div>
</div>
</div>
</div>
<div style="clear: both;"> </div>

<table id="mainBody" >
<tbody>
<tr>
<td id="left"> </td>
<td id="mid"><!-- top -->
<div id="top1">
<div id="bbb-logo"><a href="http://app.southeasttexas.bbb.org/report/10014674/"><img src="images/logo-bbb.gif" alt="metal-market-report-02-27-12" /></a></div>
</div>
<!--div id="top2"></div-->
<div id="flashnav"> </div>
<div id="topsep"> </div>
<!-- top --> <!-- content -->
<table id="contentBody">
<tbody>
<tr>
<td id="contentSep"> </td>
<td id="contentLeft">
<div id="titleBGlong">Metals Market Reports</div>
<br />

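I really wish I didn't have to come back with the same question, but even after removing all the problems I still get an error; please see it below: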

There seems to be an error in your SQL query. The MySQL server error output below, if there is any, may also help you in diagnosing the problem

ERROR: Unknown Punctuation String @ 1
STR: <?
SQL: <?php
$query = mysqli_query($con, 'SELECT * FROM th18k_content WHERE id BETWEEN 0 AND 50');

SQL query: Documentation

MySQL said: Documentation

#1064 - You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '<?php
$query = mysqli_query($con, 'SELECT * FROM th18k_content WHERE id BETWEEN' at line 1 

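Do you want to strip the HTML tags from the articles? First find the table in your database where those articles are stored, then fetch them and loop through them with:
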
<?php
// $con is assumed to be an open mysqli connection (not created in this example).
// Get a batch of articles from the database.
$query = mysqli_query($con, 'SELECT * FROM th18k_content WHERE id BETWEEN 0 AND 50');
// Prepare the UPDATE once; bound parameters avoid quoting problems in the article text.
// 'article' is a placeholder column name.
$update = mysqli_prepare($con, 'UPDATE th18k_content SET article = ? WHERE id = ?');
while ($row = mysqli_fetch_array($query, MYSQLI_ASSOC)) {    // for each article
  $lines = explode("\n", $row['article']);  // split into lines (double quotes make \n a newline)
  $newarticle = $row['article'];            // fall back to the full text if the marker is missing
  foreach ($lines as $i => $line) {
    if (strpos($line, 'titleBGlong') !== false) {            // first line containing the marker
      $newarticle = implode("\n", array_slice($lines, $i));  // keep it and everything after it
      break;                                                 // and exit the loop
    }
  }                                         // now $newarticle has the beginning removed
  $strippedarticle = strip_tags($newarticle);                // remove HTML tags
  mysqli_stmt_bind_param($update, 'si', $strippedarticle, $row['id']);
  mysqli_stmt_execute($update);             // replace the article in the db
}
?>

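I don't know exactly what your database table and columns are called, so you will need to change those names. Also, I only process ids between 0 and 50 because you could otherwise flood the database with queries, since every article triggers its own UPDATE query (just run the code, change the range to the next 50, run it again, and so on).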
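
Changing the range by hand works, but the batches can also be generated in a loop. The following is only a minimal sketch, assuming the ids run from 1 up to a known maximum; `id_batches` is a hypothetical helper, and 1000 is taken from the article count in the question rather than from the real highest id:

```php
<?php
// Hypothetical helper: yield [start, end] id ranges of at most $size ids each.
function id_batches(int $first, int $last, int $size = 50): Generator
{
    for ($start = $first; $start <= $last; $start += $size) {
        yield [$start, min($start + $size - 1, $last)];
    }
}

foreach (id_batches(1, 1000) as [$from, $to]) {
    // Only integers are interpolated, so building the query string directly is safe here.
    $sql = sprintf('SELECT * FROM th18k_content WHERE id BETWEEN %d AND %d', $from, $to);
    // Run $sql with mysqli_query($con, $sql) and process the rows one batch at a time.
}
```

Pausing briefly between batches (for example with `sleep(1)`) would spread the load further if the server is busy.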

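@Edit: The script is PHP, not SQL, so pasting it into an SQL query box produces the #1064 error shown above. Run it by saving it in a .php file on the server and opening that file like a normal page of the website (note that this example does not connect to the database; you have to add that yourself).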

This removes every line until "titleBGlong" is found; after that, strip_tags removes the remaining tags.
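
The trimming and stripping steps can be tried without a database. The sketch below is one possible way to package them; the function name `clean_article` and the sample string are hypothetical, and it assumes the marker sits on a line of its own, as in the HTML from the question:

```php
<?php
// Hypothetical helper: drop every line before the first one containing $marker,
// then strip the remaining HTML tags. If the marker is absent, the whole text is kept.
function clean_article(string $html, string $marker = 'titleBGlong'): string
{
    $lines = explode("\n", $html);
    foreach ($lines as $i => $line) {
        if (strpos($line, $marker) !== false) {
            $lines = array_slice($lines, $i); // keep the marker line and what follows
            break;
        }
    }
    return trim(strip_tags(implode("\n", $lines)));
}

// Made-up sample in the shape of the markup from the question.
$sample = "<div id=\"mainDIV\">\n"
        . "<div id=\"topnav\"><a href=\"../store/\">SHOP NOW</a></div>\n"
        . "<div id=\"titleBGlong\">Metals Market Reports</div>\n"
        . "<br />\n"
        . "<p>Gold closed higher today.</p>";

$cleaned = clean_article($sample);
echo $cleaned, "\n";
```

Note that strip_tags leaves entities such as `&nbsp;` in place, so the result may still need `html_entity_decode` or similar cleanup depending on the articles.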