How to get the text after a tag
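I am using the PHP Simple HTML DOM Parser.
I don't understand how to get the text that comes after a tag, for example (<b></b> Text).
I am scraping a website and getting the HTML shown below.
From those details I want to build an array() like this: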
array(
'release_year'=> 2009,
'genre' => 'Drama,Fantasy,Horror',
'description' => 'etc etc etc',
'imdb' => 'link of imdb',
'total_episode'=> '28 episode',
'latest_episode_title'=> 'title',
'latest_episode_link' => 'link',
'latest_episode_with_link_title'=> 'title',
'latest_episode_with_link_link' => 'link',
);
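I have managed to get the text inside the <b></b> tags, but I don't know how to get the text that follows each <b> tag in the HTML. Please look at the HTML, my PHP code and its result below, and help me solve this. Thank you very much.
Here is the HTML: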
<div class="show-summary">
<table border="0" style="padding:3px">
<tbody>
<tr>
<td style="padding:3px">
<a href="/serie/the_vampire_diaries">
<img src="http://static1.watchseries.ag/90/1/The_Vampire_Diaries-18597.JPEG" alt="Watch Series - The Vampire Diaries" title="Watch Series - The Vampire Diaries" height="120px" width="85px">
</a>
</td>
<td valign="top" style="padding:3px">
<p>
<b>Release Year: </b>
2009<br>
<b>Genre: <a href="/genres/Drama">Drama</a>, <a href="/genres/Fantasy">Fantasy</a>, <a href="/genres/Horror">Horror</a></b>
<br>
<b>External Links: </b>
<a href="http://www.imdb.com/title/tt1405406/" target="_blank">IMDB</a>
<br>
<b>No. of episodes: </b>
128 episodes <br>
<b>Latest Episode: </b>
<a title="Watch The Vampire Diaries Latest Episode (The Vampire Diaries Season 6 Episode 16)" href="/episode/the_vampire_diaries_s6_e16.html">Season 6 Episode 16 The Downward Spiral (26/02/2015)</a>
<br>
<b>Latest Episode With Links: </b>
<a title="Watch The Vampire Diaries Latest Episode (The Vampire Diaries Season 6 Episode 11)" href="/episode/the_vampire_diaries_s6_e11.html">Season 6 Episode 11 Woke Up With a Monster (22/01/2015)</a>
<br>
</p>
<div style="float: left; height: 30px; overflow: hidden; width: 100px;">
<div class="fb-like fb_iframe_widget" data-href="http://watchseries.ag/serie/the_vampire_diaries" data-send="false" data-layout="button_count" data-show-faces="false" fb-xfbml-state="rendered" fb-iframe-plugin-query="app_id=434603673340441&href=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fthe_vampire_diaries&layout=button_count&locale=en_US&sdk=joey&send=false&show_faces=false">
<span style="vertical-align: bottom; width: 79px; height: 20px;">
<iframe name="fbc5b3f58" width="1000px" height="1000px" frameborder="0" allowtransparency="true" scrolling="no" title="fb:like Facebook Social Plugin" src="http://www.facebook.com/plugins/like.php?app_id=434603673340441&channel=http%3A%2F%2Fstatic.ak.facebook.com%2Fconnect%2Fxd_arbiter%2F7r8gQb8MIqE.js%3Fversion%3D41%23cb%3Df314058a5%26domain%3Dwatchseries.ag%26origin%3Dhttp%253A%252F%252Fwatchseries.ag%252Ff5fff1c%26relation%3Dparent.parent&href=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fthe_vampire_diaries&layout=button_count&locale=en_US&sdk=joey&send=false&show_faces=false" style="border: none; visibility: visible; width: 79px; height: 20px;" class="" __idm_id__="824321"></iframe>
</span>
</div>
</div>
<iframe id="twitter-widget-1" scrolling="no" frameborder="0" allowtransparency="true" src="http://platform.twitter.com/widgets/tweet_button.b68aed79dd9ad79554bcd8c9141c94c8.en.html#_=1422079075304&count=horizontal&dnt=false&id=twitter-widget-1&lang=en&original_referer=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fthe_vampire_diaries&size=m&text=Watch%20The%20Vampire%20Diaries%20Serie%20Online%20-%20Watch%20Series&url=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fthe_vampire_diaries" class="twitter-share-button twitter-tweet-button twitter-share-button twitter-count-horizontal" title="Twitter Tweet Button" data-twttr-rendered="true" style="width: 107px; height: 20px;"></iframe>
<script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?'http':'https';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+'://platform.twitter.com/widgets.js';fjs.parentNode.insertBefore(js,fjs);}}(document, 'script', 'twitter-wjs');</script>
<br clear="all">
<b>Description :</b>
The vampire brothers Damon and Stefan Salvatore, eternal adolescents, having been leading 'normal' lives, hiding their bloodthirsty condition, for centuries, moving on before their non-aging is noticed.
<span id="plot_mored"> They are back in the Virginia town where they became vampires. Stefan is noble, denying himself blood to avoid killing, and tries to control his evil brother Damon. Stefan falls in love with schoolgirl Elena, whose best friend is a witch, like her grandma.</span>
<a onclick="return showMoreContent('plot_mored');" class="small dark" href="#" id="more" style="display: none;">[+]more</a>
<br>
<p></p>
</td>
</tr>
</tbody>
</table>
</div>
Here is my PHP code:
$html = new simple_html_dom();
$html->load_file("LINK");
foreach($html->find('div.show-summary table tbody tr') as $rowz){
foreach($rowz->find('p') as $p){
foreach($p->find('b') as $b){
echo $b->innertext.'<br/>';
}
}
}
Running the code above gives the following result:
Release Year:
Genre: Drama, Fantasy, Horror
External Links:
No. of episodes:
Latest Episode:
Latest Episode With Links:
Description :
I want to create an array containing the details above.
Have you tried adding the p and b tags to the search:
$html->find('div.show-summary table tbody tr p b')
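This is only one approach and it is not complete, but it should give you an idea.
Getting the release year is a bit tricky. There is probably a better way, but this works: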
$html = new simple_html_dom();
$html->load_file('yourhtmlfile.html');
# set the 'mapping': map the search to the field you need
$map = array(
array(
'query'=>'div.show-summary table tbody tr p',
'nodeIndex'=>0,
'attribute'=>'',
'method'=>'innertext',
'extract_string'=>'',
'get_string_between'=>array(
'start'=>'<b>Release Year: </b>',
'end'=>'<br>',
),
'field'=>'Release Year',
),
array(
'query'=>'div.show-summary table tbody tr p b',
'nodeIndex'=>1,
'attribute'=>'',
'method'=>'plaintext',
'extract_string'=>'Genre: ',
'get_string_between'=>'',
'field'=>'Genre',
),
array(
'query'=>'div.show-summary table tbody tr p a',
'nodeIndex'=>4,
'attribute'=>'title',
'method'=>'',
'extract_string'=>'',
'get_string_between'=>'',
'field'=>'Latest Episode title',
),
array(
'query'=>'div.show-summary table tbody tr p a',
'nodeIndex'=>4,
'attribute'=>'href',
'method'=>'',
'extract_string'=>'',
'get_string_between'=>'',
'field'=>'Latest Episode link',
),
);
# the resulting array with fields values
$fieldsResult = array();
foreach($map as $search)
{
# get the search result node
$node = $html->find($search['query'],$search['nodeIndex']);
# get the node attributes
$node_attributes = $node->attr;
# attribute set in the map? get it.
$content = $search['attribute']!=''
? $node_attributes[$search['attribute']]
: '';
# method set in the map? get it
$content = $search['method']!=''
? $node->{$search['method']}
: $content;
# string to be cleaned? extract it
$result = $search['extract_string']!=''
? str_replace($search['extract_string'], '', $content)
: $content;
# get content from between two string marks
if(is_array($search['get_string_between']))
{
$result = trim($result);
$start_mark = $search['get_string_between']['start'];
$end_mark = $search['get_string_between']['end'];
$init_pos = strpos($result, $start_mark);
if($init_pos !== false)
{
# the value starts right after the start mark
$substring_start = $init_pos + strlen($start_mark);
# look for the end mark only after the start mark
$end_pos = strpos($result, $end_mark, $substring_start);
# substr() takes a length, not an end position
$result = $end_pos !== false
? trim(substr($result, $substring_start, $end_pos - $substring_start))
: trim(substr($result, $substring_start));
}
}
# final result
$fieldsResult[$search['field']] = $result;
}
var_dump($fieldsResult);
////////////
// OUTPUT //
////////////
array (size=4)
'Release Year' => string '2009' (length=4)
'Genre' => string 'Drama, Fantasy, Horror' (length=22)
'Latest Episode title' => string 'Watch The Vampire Diaries Latest Episode (The Vampire Diaries Season 6 Episode 16)' (length=82)
'Latest Episode link' => string '/episode/the_vampire_diaries_s6_e16.html' (length=40)
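The "string between two marks" step in the mapping above can also be pulled out into a small helper, which makes it easier to test on its own. This is only a sketch: the name string_between is made up for illustration and is not part of Simple HTML DOM, and it assumes the start and end marks appear literally in the markup.

```php
<?php
// Hypothetical helper: returns the trimmed text between $start and the
// first $end that follows it, or null when either mark is missing.
function string_between(string $haystack, string $start, string $end): ?string
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    // Move past the start mark so the value begins right after it.
    $from += strlen($start);

    // Search for the end mark only after the start mark.
    $to = strpos($haystack, $end, $from);
    if ($to === false) {
        return null;
    }
    // substr() expects a length, so subtract the start offset.
    return trim(substr($haystack, $from, $to - $from));
}

$sample = '<b>Release Year: </b> 2009<br><b>No. of episodes: </b> 128 episodes <br>';
echo string_between($sample, '<b>Release Year: </b>', '<br>');
```

Returning null instead of an empty string lets the caller tell "mark not found" apart from "value is empty".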
This may not be what you want if the file changes a lot, but you could split the content on <br>:
$html = new simple_html_dom();
$html->load_file("LINK");
foreach($html->find('div.show-summary table tbody tr') as $rowz){
foreach($rowz->find('p') as $p){
$matches = explode('<br>',$p->innertext);
foreach ($matches as $entry) {
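// ~ is used as the delimiter so the / in </b> needs no escaping; s lets . match newlines
if (preg_match('~<b>(.*)</b>(.*)~is', $entry, $stuff)) {
echo "{$stuff[1]} => {$stuff[2]}";
}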
}
}
}
Sorry, you may need to clean it up or fiddle with it to get it working the way you want, and check for bad/undefined entries.
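One thing to watch with explode('<br>', ...) is that the page also contains variants such as <br clear="all">. A possible refinement, sketched here on a plain string rather than on a Simple HTML DOM node, is to split with preg_split and then separate the label from the value at the first colon. The function name split_label_rows and the sample markup are assumptions made for this example.

```php
<?php
// Hypothetical helper: turns "<b>Label: </b> value<br>" rows into an
// associative array of label => value.
function split_label_rows(string $html): array
{
    $rows = [];
    // Split on <br>, <br/>, <br /> and <br clear="all">.
    foreach (preg_split('~<br\b[^>]*>~i', $html) as $chunk) {
        // The label sits inside <b>...</b>; the value may be inside or after it.
        if (preg_match('~<b>(.*?)</b>(.*)~is', $chunk, $m)) {
            $text = strip_tags($m[1] . $m[2]);
            // Split on the first colon only, so colons in the value survive.
            [$label, $value] = array_pad(explode(':', $text, 2), 2, '');
            $rows[trim($label)] = trim($value);
        }
    }
    return $rows;
}

$sample = '<b>Release Year: </b> 2009<br/>'
        . '<b>Genre: <a href="/genres/Drama">Drama</a></b><br clear="all">'
        . '<b>Description :</b> A story: part one<br>';

print_r(split_label_rows($sample));
```

Chunks without a <b> label are skipped, which covers the bad/undefined entries mentioned above.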
Hi all, I now have a complete solution. After a lot of research I wrote a function that does what I want. Here it is, please check it:
<?php
function do_html_array($td,$dlm='<br>'){
if(!empty($td)){
$td = html_entity_decode($td);
$td = preg_replace('/<script\b[^>]*>(.*?)<\/script>/is', "", $td);
$html_array = explode($dlm,$td);
$html_key_array = array();
foreach($html_array as $key=>$html){
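// split on the first colon only, so colons inside the value are kept
$html = explode(':',trim(strip_tags($html)),2);
if(trim($html[0])!=''){
// a chunk without a colon has no value part
if(count($html)<2) $html[1] = '';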
if(strtolower(trim($html[0]))=='description') $html[1] = str_ireplace('[+]more','',$html[1]);
$html_key_array[strtolower(trim($html[0]))] = trim($html[1]);
switch(trim(strtolower($html[0]))){
case'external links':
preg_match_all('~<a\s+.*?</a>~is',$html_array[$key],$html_key_array['imdb_link']);
break;
case'genre':
preg_match_all('~<a\s+.*?</a>~is',$html_array[$key],$html_key_array['genre_link']);
break;
// further define here...
}
}
}
return $html_key_array;
}
return false;
}
$td = '<td valign="top" style="padding:3px"><p><b>Release Year: </b>2007<br><b>Genre: <a href="/genres/Comedy">Comedy</a></b><br><b>External Links: </b> <a target="_blank" href="http://www.imdb.com/title/tt0898266/">IMDB</a> <br><b>No. of episodes: </b> 178 episodes <br><b>Latest Episode: </b> <a href="/episode/big_bang_theory_s8_e16.html" title="Watch The Big Bang Theory Latest Episode (The Big Bang Theory Season 8 Episode 16)">Season 8 Episode 16 The Intimacy Acceleration (01/01/1970)</a><br><b>Latest Episode With Links: </b> <a href="/episode/big_bang_theory_s8_e13.html" title="Watch The Big Bang Theory Latest Episode (The Big Bang Theory Season 8 Episode 13)">Season 8 Episode 13 The Anxiety Optimization (15/01/2015)</a><br></p><div style="float: left; height: 30px; overflow: hidden; width: 100px;"><div data-show-faces="false" data-layout="button_count" data-send="false" data-href="http://watchseries.ag/serie/big_bang_theory" class="fb-like fb_iframe_widget" fb-xfbml-state="rendered" fb-iframe-plugin-query="app_id=434603673340441&href=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fbig_bang_theory&layout=button_count&locale=en_US&sdk=joey&send=false&show_faces=false"><span style="vertical-align: bottom; width: 80px; height: 20px;"><iframe width="1000px" height="1000px" frameborder="0" name="f225e71df2e6d02" allowtransparency="true" scrolling="no" title="fb:like Facebook Social Plugin" style="border: medium none; visibility: visible; width: 80px; height: 20px;" src="http://www.facebook.com/plugins/like.php?app_id=434603673340441&channel=http%3A%2F%2Fstatic.ak.facebook.com%2Fconnect%2Fxd_arbiter%2FDU1Ia251o0y.js%3Fversion%3D41%23cb%3Df1f47ad29892336%26domain%3Dwatchseries.ag%26origin%3Dhttp%253A%252F%252Fwatchseries.ag%252Ff18c568fa0d51e4%26relation%3Dparent.parent&href=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fbig_bang_theory&layout=button_count&locale=en_US&sdk=joey&send=false&show_faces=false" class=""></iframe></span></div></div><iframe frameborder="0" id="twitter-widget-1" 
scrolling="no" allowtransparency="true" src="http://platform.twitter.com/widgets/tweet_button.67ae45a68af44ab435dd5797206058d3.en.html#_=1422780550826&count=horizontal&dnt=false&id=twitter-widget-1&lang=en&original_referer=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fbig_bang_theory&size=m&text=Watch%20The%20Big%20Bang%20Theory%20Serie%20Online%20-%20Watch%20Series&url=http%3A%2F%2Fwatchseries.ag%2Fserie%2Fbig_bang_theory" class="twitter-share-button twitter-tweet-button twitter-share-button twitter-count-horizontal" title="Twitter Tweet Button" data-twttr-rendered="true" style="width: 109px; height: 20px;"></iframe><script>!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0],p=/^http:/.test(d.location)?\'http\':\'https\';if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src=p+\'://platform.twitter.com/widgets.js\';fjs.parentNode.insertBefore(js,fjs);}}(document, \'script\', \'twitter-wjs\');</script><br clear="all"><b>Description :</b> A woman who moves into an apartment across the hall from two brilliant but socially awkward physicists shows them how little they know about life outside of the laboratory.<br><p></p></td>';
$html_array = do_html_array($td);
if($html_array){
foreach($html_array as $key=>$value){
if(is_array($value)){
echo "<strong>$key</strong>:";
foreach($value[0] as $link){
echo "$link , ";
}
echo "<br>--------------------------------<br>";
}else{
echo "<strong>$key</strong>: $value";
echo "<br>--------------------------------<br>";
}
}
}
?>
The function above collects all the text and stores it in an array as key/value pairs :)
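For comparison, the same key/value extraction can be sketched by walking the nodes that follow each <b> label instead of splitting strings. Note that this swaps in PHP's built-in DOM extension (DOMDocument and DOMXPath) rather than Simple HTML DOM; the function name parse_summary, the returned array shape and the shortened sample markup are all assumptions made for this example.

```php
<?php
// Hypothetical helper built on PHP's DOM extension, not on Simple HTML DOM.
function parse_summary(string $markup): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // real-world HTML is rarely valid
    $doc->loadHTML($markup);
    libxml_clear_errors();

    $fields = [];
    foreach ((new DOMXPath($doc))->query('//b') as $b) {
        // "Release Year: " becomes the key "release_year"; any text after
        // the colon inside <b> (as with Genre) is kept as part of the value.
        $parts = explode(':', $b->textContent, 2);
        $key   = strtolower(preg_replace('/\W+/', '_', trim($parts[0])));
        $text  = trim($parts[1] ?? '');
        $links = [];

        // Walk the siblings that follow the <b> until the next <br>.
        for ($n = $b->nextSibling; $n !== null && $n->nodeName !== 'br'; $n = $n->nextSibling) {
            $text .= ' ' . trim($n->textContent);
            if ($n->nodeName === 'a') {
                $links[] = $n->getAttribute('href');
            }
        }
        $fields[$key] = ['text' => trim($text), 'links' => $links];
    }
    return $fields;
}

$markup = '<p><b>Release Year: </b> 2009<br>'
        . '<b>External Links: </b> <a href="http://www.imdb.com/title/tt1405406/">IMDB</a><br>'
        . '<b>No. of episodes: </b> 128 episodes <br></p>';

print_r(parse_summary($markup));
```

Because the text after each label is a text node rather than an element, it is reached here through nextSibling. Each entry keeps both the visible text and any link targets, so the array from the question can be assembled from it.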