How to scrape each data value from ul li tag using PHP?

I have a page containing HTML like the following:

<ul class ='trainList'>
<li>
    <div class="smallFont farelist no-discount ">
        <div class="train-no">ABC 701</div>
        <div class="train-time">06:10<br>07:15</div>
        <div class="train-info">
            <div class="box">
                <div class="total-price">MYR 50.00</div>
                <div class="farediscount">
                    <div class="actual-fare-price">Array</div>
                    <div class="train-discount"></div>
                </div>
            </div>
</li>
<li>
    <div class="smallFont farelist no-discount ">
        <div class="train-no">ABC 701</div>
        <div class="train-time">06:10<br>07:15</div>
        <div class="train-info">
            <div class="box">
                <div class="total-price">MYR 50.00</div>
                <div class="farediscount">
                    <div class="actual-fare-price">Array</div>
                    <div class="train-discount"></div>
                </div>
            </div>
</li>

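I want to scrape and extract the train number, train time, and train price from the markup above.

My code doesn't scrape the information I want; it gives me blank output instead. I went through many previously posted questions but couldn't find anything similar to this.

My code: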

 $train_doc = new DOMDocument();

libxml_use_internal_errors(TRUE); 

if(!empty($html)){ 

  $train_doc->loadHTML($html);

  libxml_clear_errors(); 

  $train_xpath = new DOMXPath($train_doc);


  $train_list = array();

$train = $train_xpath->query('//div[@class="smallFont farelist no-discount"]');
var_dump($train);
if($train->length > 0){   


  foreach($train as $pat){

      $name = $train_xpath->query('div[@class="train-no"]', $pat)->item(0)->nodeValue;

      $train_types = array(); 
      $types = $train_xpath->query('div[@class="train-time"]/a', $pat);


      foreach($types as $type){
          $train_types[] = $type->nodeValue; 
      }

      $train_list[] = array('name' => $name, 'types' => $train_types);

  }
}
}

echo "<pre>";
print_r($train_list);
echo "</pre>";

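You need to target the elements first: select each li, then point to the elements you need inside it. Two things break your queries. The class attribute in your markup ends with a trailing space, so the exact comparison @class="smallFont farelist no-discount" never matches; contains() avoids that. Also, the train-time div has no a child, so div[@class="train-time"]/a returns nothing. Something like this works: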

$train_list = array();
$train = $train_xpath->query('//li/div[contains(@class, "smallFont farelist no-discount")]');
if($train->length > 0) {
    foreach($train as $t) {
        $time_s =  $train_xpath->evaluate('string(./div[@class="train-time"]/text()[1])', $t);
        $time_e =  $train_xpath->evaluate('string(./div[@class="train-time"]/text()[2])', $t);
        $train_list[] = array(
            'train_no' => $train_xpath->evaluate('string(./div[@class="train-no"])', $t),
            'train_time' => "$time_s - $time_e",
            'train_price' => $train_xpath->evaluate('string(./div[@class="train-info"]/div/div[@class="total-price"])', $t),
        );
    }
}
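To see why the exact class comparison comes back empty, here is a minimal self-contained sketch. The markup is a trimmed copy of the question's HTML with the missing closing tags added, and the variable names ($doc, $exact, $loose, $rows) and array keys are only illustrative assumptions, not part of the original code.

```php
<?php
// Trimmed sample input based on the question; closing tags added for clarity.
$html = <<<MARKUP
<ul class='trainList'>
<li>
    <div class="smallFont farelist no-discount ">
        <div class="train-no">ABC 701</div>
        <div class="train-time">06:10<br>07:15</div>
        <div class="train-info">
            <div class="box">
                <div class="total-price">MYR 50.00</div>
            </div>
        </div>
    </div>
</li>
</ul>
MARKUP;

libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($doc);

// Exact string comparison: fails because the attribute value ends with a space.
$exact = $xpath->query('//div[@class="smallFont farelist no-discount"]');
// Substring comparison: tolerant of the trailing space.
$loose = $xpath->query('//li/div[contains(@class, "farelist")]');
echo "exact: {$exact->length}, contains: {$loose->length}\n";

$rows = [];
foreach ($loose as $t) {
    // Each expression is evaluated relative to the current div ($t).
    $rows[] = [
        'train_no'    => $xpath->evaluate('string(./div[@class="train-no"])', $t),
        'time_start'  => $xpath->evaluate('string(./div[@class="train-time"]/text()[1])', $t),
        'time_end'    => $xpath->evaluate('string(./div[@class="train-time"]/text()[2])', $t),
        'train_price' => $xpath->evaluate('string(.//div[@class="total-price"])', $t),
    ];
}
print_r($rows);
```

With this input the first line should report zero matches for the exact comparison and one for contains(). Note that contains() is a plain substring test, so "farelist" would also match a class such as "farelist-old"; if that matters, a stricter token test can be built with concat() and normalize-space().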

Alternatively, if you only need the train times, match on the class name and print each element's text:

libxml_use_internal_errors(true);
$page = new DOMDocument();
$page->preserveWhiteSpace = false;
$page->loadHTML($html);
$xpath = new DOMXPath($page);

foreach ($xpath->query("//*[contains(@class, 'train-time')]") as $element) {
    print_r($element->nodeValue);
}
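One caveat with nodeValue here: the two times are separated only by a br tag, so the element's text comes out run together as "06:1007:15". A possible way around that, sketched below under the assumption that each time sits in its own text node, is to walk the text nodes of the element. The one-line $html value and the $times variable are illustrative only.

```php
<?php
// Illustrative input: a single train-time element from the question's markup.
$html = '<div class="train-time">06:10<br>07:15</div>';

libxml_use_internal_errors(true);
$page = new DOMDocument();
$page->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($page);

foreach ($xpath->query("//*[contains(@class, 'train-time')]") as $element) {
    $times = [];
    // Collect each direct text node separately instead of the merged nodeValue.
    foreach ($xpath->query('./text()', $element) as $textNode) {
        $value = trim($textNode->nodeValue);
        if ($value !== '') {
            $times[] = $value;
        }
    }
    echo implode(' - ', $times), "\n";
}
```

For the sample element this prints the two times joined with " - " rather than as a single run of digits.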

Hope this helps.