PHP Simple HTML DOM not parsing certain links
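I am learning about the HTML DOM parser and how it works. I have hit an obstacle: I cannot parse the following link, although I can parse the root domain and other websites. Can anyone help me understand why I cannot parse this particular link?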
<?php
include('simple_html_dom.php');
$base = 'http://www.stupidstudios.com/samsung-galaxy-s6/p/bbuynow';
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $base);
curl_setopt($curl, CURLOPT_REFERER, $base);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$str = curl_exec($curl);
curl_close($curl);
$html_base = new simple_html_dom();
$html_base->load($str);
foreach($html_base->find('h1') as $element) {
    echo "<pre>";
    print_r( $element );
    echo "</pre>";
}
$html_base->clear();
unset($html_base);
?>
It seems to work when you add (spoof) a browser user agent:
$base = 'http://www.flipkart.com/samsung-galaxy-s6/p/itme5z4aypvtrxmy';
$curl = curl_init($base);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
// Send a browser-like User-Agent header with the request
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$str = curl_exec($curl);
curl_close($curl);
echo $str;
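To see why a given link fails, it can help to look at the HTTP status code and the cURL error string rather than only the returned body. The following is a minimal sketch, not a definitive implementation: the helper names `build_curl_options` and `fetch_page` are hypothetical, and it assumes the server's behavior depends on the User-Agent header, which the snippet above suggests but does not prove.

```php
<?php
// Hypothetical helper names; only the cURL functions and constants are standard PHP.

/**
 * Build the cURL options for a GET request that sends a
 * browser-like User-Agent header.
 */
function build_curl_options($url, $userAgent)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
    );
}

/**
 * Fetch a page and report what happened.
 * Returns an array with:
 *   'body'   => the response body, or false if the transfer failed
 *   'status' => the HTTP status code of the final response (0 if none)
 *   'error'  => the cURL error message ('' if there was no error)
 */
function fetch_page($url, $userAgent)
{
    $curl = curl_init();
    curl_setopt_array($curl, build_curl_options($url, $userAgent));
    $body   = curl_exec($curl);
    $status = curl_getinfo($curl, CURLINFO_HTTP_CODE);
    $error  = curl_error($curl);
    curl_close($curl);

    return array('body' => $body, 'status' => $status, 'error' => $error);
}
```

Calling `fetch_page($base, $userAgent)` with the user agent string shown above and printing the `status` and `error` entries should indicate whether the request itself is being refused. If `body` is a non-empty string, it can be passed to `$html_base->load()` exactly as in the question.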