preg_match to get the text inside an a href

Question

HTML:

 <td class="td_class"><a href="javascript:goRead('115');" onmouseover="status='read';return true;" onmouseout="status=''" onfocus="blur()">Title</a></td>

I need a preg_match pattern that extracts the title. I have already tried this regular expression:

preg_match_all('/[^>]class=["\']td_class[\'"]*>(.*?)<\//',$result,$match);
    $datas['title'] = $match[1];
    var_dump($datas['title']);

The result is

 <a href="javascript:goRead('115');" onmouseover="status='read';return true;" onmouseout="status=''" onfocus="blur()">Title</a>

But I only want the title. Does anyone know how to do this? Thanks!

Answer 1

DOMDocument works well for this; see its page in the PHP manual.

A simple example:

  // Fetching the page is only needed when you parse the HTML of a remote site.
  // A scheme (http:// or https://) is required, otherwise PHP looks for a local file.
  $html = file_get_contents('http://www.pathtohtml.com');
  $doc = new DOMDocument();
  // To load an HTML file directly, use loadHTMLFile() instead
  $doc->loadHTML($html); // loads HTML from a string
  $aTags = $doc->getElementsByTagName('a');
  foreach ($aTags as $aTag) {
    // $aTag->nodeValue contains the text of the <a> tag
    // Attributes are available too, e.g. $aTag->getAttribute('href')
    echo $aTag->nodeValue, PHP_EOL;
  }

If you need to query the DOM more precisely, I suggest using XPath.
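As a rough illustration of the XPath route, here is a minimal sketch applied to the markup from the question. It assumes the `class` attribute is exactly `td_class` and that the link is a direct child of the cell; the surrounding `<table><tr>` wrapper and the `$titles` variable are additions made only for this example.

```php
<?php
// The markup from the question, held in a string for this sketch.
// The <table><tr> wrapper is an assumption so the cell has valid parents.
$html = <<<'HTML'
<table><tr><td class="td_class"><a href="javascript:goRead('115');" onmouseover="status='read';return true;" onmouseout="status=''" onfocus="blur()">Title</a></td></tr></table>
HTML;

// Collect libxml warnings internally instead of printing them;
// real-world HTML is often not perfectly valid.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Select every <a> that is a direct child of <td class="td_class">.
$links = $xpath->query('//td[@class="td_class"]/a');

$titles = [];
foreach ($links as $link) {
    // textContent is the text inside the tag, without any markup.
    $titles[] = trim($link->textContent);
}

var_dump($titles);
```

If a cell can carry several classes, an exact match on `@class` will miss it; something like `contains(concat(" ", normalize-space(@class), " "), " td_class ")` is a common workaround in that case.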

Hope this helps.

php

preg-match-all