How do I scrape each data value from ul li tags using PHP?
I have a page containing HTML like the following:
<ul class ='trainList'>
<li>
<div class="smallFont farelist no-discount ">
<div class="train-no">ABC 701</div>
<div class="train-time">06:10<br>07:15</div>
<div class="train-info">
<div class="box">
<div class="total-price">MYR 50.00</div>
<div class="farediscount">
<div class="actual-fare-price">Array</div>
<div class="train-discount"></div>
</div>
</div>
</li>
<li>
<div class="smallFont farelist no-discount ">
<div class="train-no">ABC 701</div>
<div class="train-time">06:10<br>07:15</div>
<div class="train-info">
<div class="box">
<div class="total-price">MYR 50.00</div>
<div class="farediscount">
<div class="actual-fare-price">Array</div>
<div class="train-discount"></div>
</div>
</div>
</li>
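I want to scrape the train number, train time, and train price from the markup above.
My code does not pick up the information I want; it returns blank output instead. I have checked many previously posted questions but could not find anything similar.
My code: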
$train_doc = new DOMDocument();
libxml_use_internal_errors(TRUE);
if(!empty($html)){
$train_doc->loadHTML($html);
libxml_clear_errors();
$train_xpath = new DOMXPath($train_doc);
$train_list = array();
$train = $train_xpath->query('//div[@class="smallFont farelist no-discount"]');
var_dump($train);
if($train->length > 0){
foreach($train as $pat){
$name = $train_xpath->query('div[@class="train-no"]', $pat)->item(0)->nodeValue;
$train_types = array();
$types = $train_xpath->query('div[@class="train-time"]/a', $pat);
foreach($types as $type){
$train_types[] = $type->nodeValue;
$train_list[] = array('name' => $name, 'types' => $train_types);
}
}
}
echo "<pre>";
print_r($train_list);
echo "</pre>";
You need to target the right elements first: select each li, then point to the elements you need inside it:
$train_list = array();
$train = $train_xpath->query('//li/div[contains(@class, "smallFont farelist no-discount")]');
if($train->length > 0) {
foreach($train as $t) {
$time_s = $train_xpath->evaluate('string(./div[@class="train-time"]/text()[1])', $t);
$time_e = $train_xpath->evaluate('string(./div[@class="train-time"]/text()[2])', $t);
$train_list[] = array(
'train_no' => $train_xpath->evaluate('string(./div[@class="train-no"])', $t),
'train_time' => "$time_s - $time_e",
'train_price' => $train_xpath->evaluate('string(./div[@class="train-info"]/div/div[@class="total-price"])', $t),
);
}
}
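For reference, two details in the question's markup seem to explain the empty result: the class attribute ends with a trailing space, so the exact comparison `@class="smallFont farelist no-discount"` matches nothing, and `div[@class="train-time"]/a` looks for `a` elements that the markup does not contain. Below is a minimal, self-contained sketch of the approach above. The function name `scrape_trains` and the trimmed, well-formed sample markup are assumptions made for illustration, not part of the original page.

```php
<?php
// Trimmed sample based on the question's markup; the trailing space in the
// class attribute is kept on purpose.
$html = <<<HTML
<ul class='trainList'>
<li>
<div class="smallFont farelist no-discount ">
<div class="train-no">ABC 701</div>
<div class="train-time">06:10<br>07:15</div>
<div class="train-info">
<div class="box">
<div class="total-price">MYR 50.00</div>
</div>
</div>
</div>
</li>
</ul>
HTML;

// Hypothetical helper: returns one array per train row found in $html.
function scrape_trains(string $html): array
{
    libxml_use_internal_errors(true);   // silence warnings about imperfect HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($doc);

    $trains = [];
    // contains() tolerates the trailing space that breaks an exact @class match.
    foreach ($xpath->query('//li/div[contains(@class, "farelist")]') as $row) {
        // The two times are separate text nodes on either side of the <br>.
        $start = $xpath->evaluate('string(./div[@class="train-time"]/text()[1])', $row);
        $end   = $xpath->evaluate('string(./div[@class="train-time"]/text()[2])', $row);
        $trains[] = [
            'train_no'    => trim($xpath->evaluate('string(./div[@class="train-no"])', $row)),
            'train_time'  => trim($start) . ' - ' . trim($end),
            'train_price' => trim($xpath->evaluate('string(.//div[@class="total-price"])', $row)),
        ];
    }
    return $trains;
}

print_r(scrape_trains($html));
```

Passing the row node as the second argument to `evaluate()` keeps each lookup relative to that row, which is what ties a train number to its own time and price.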
libxml_use_internal_errors(true);
$page = new DOMDocument();
$page->preserveWhiteSpace = false;
$page->loadHTML($html);
$xpath = new DomXPath($page);
foreach($xpath->query("//*[contains(@class, 'train-time')]") as $element){
print_r($element->nodeValue);
}
Hope this helps.
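One caveat with `contains(@class, 'train-time')`: it is a substring test, so it would also match a class such as `train-time-note`, and `nodeValue` joins the two times into `06:1007:15`. A possible refinement, sketched here under assumptions, pads the class list with spaces so only whole class tokens match and reads the text nodes separately. The helper `class_token` and the `train-time-note` element are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical helper: builds an XPath 1.0 predicate that matches a whole
// class token rather than any substring of the class attribute.
function class_token(string $name): string
{
    return "contains(concat(' ', normalize-space(@class), ' '), ' $name ')";
}

// The second div is an invented example of a class that a plain substring
// test would match by accident.
$html = '<div class="train-time">06:10<br>07:15</div>'
      . '<div class="train-time-note">ignored</div>';

libxml_use_internal_errors(true);
$page = new DOMDocument();
$page->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($page);

$times = [];
foreach ($xpath->query('//*[' . class_token('train-time') . ']') as $element) {
    // Collect the text nodes on either side of the <br> individually.
    $parts = [];
    foreach ($xpath->query('./text()', $element) as $text) {
        $parts[] = trim($text->nodeValue);
    }
    $times[] = $parts;
}

print_r($times);
```

The extra `normalize-space()` call also absorbs stray whitespace in the attribute, such as the trailing space in the question's markup.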