PHP cURL blocked on betfair?
I am trying to scrape the betfair.com site with this PHP code:
<?php
// Defining the basic cURL function
function curl($url) {
    $ch = curl_init(); // Initialising cURL
    curl_setopt($ch, CURLOPT_URL, $url); // Setting cURL's URL option with the $url variable passed into the function
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // Setting cURL's option to return the webpage data
    $data = curl_exec($ch); // Executing the cURL request and assigning the returned data to the $data variable
    curl_close($ch); // Closing cURL
    return $data; // Returning the data from the function
}

$scraped_website = curl("https://www.betfair.com/exchange/football");
echo $scraped_website;
?>
Used this way, the code works.
But if, instead of "https://www.betfair.com/exchange/football", I choose "https://www.betfair.com/exchange/football/event?id=28040884", the code stops working.
Please help.
Looking at the headers curl receives:
HTTP/1.1 302 Moved Temporarily
Location: https://www.betfair.com/exchange/plus/#/football/event/28040884
Cache-Control: no-cache
Pragma: no-cache
Date: Fri, 09 Dec 2016 17:38:52 GMT
Age: 0
Transfer-Encoding: chunked
Connection: keep-alive
Server: ATS/5.2.1
Set-Cookie: vid=00956994-084c-444b-ad26-38b1119f4e38; Domain=.betfair.com; Expires=Mon, 01-Dec-2022 09:00:00 GMT; Path=/
X-Opaque-UUID: 80506a77-12c1-4c89-b4a6-fa499fd23895
Actually, https://www.betfair.com/exchange/football/event?id=28040884 sends a 302 Moved Temporarily HTTP redirect, and your script does not follow redirects, which is why it does not work. Fix that (using CURLOPT_FOLLOWLOCATION) and your code works fine. Fixed code:
function curl($url) {
    $ch = curl_init(); // Initialising cURL
    curl_setopt($ch, CURLOPT_URL, $url); // Setting cURL's URL option with the $url variable passed into the function
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // Setting cURL's option to return the webpage data
    curl_setopt($ch, CURLOPT_VERBOSE, TRUE); // Printing request/response details to stderr, useful for spotting the redirect
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE); // Following the Location header of 3xx responses
    $data = curl_exec($ch); // Executing the cURL request and assigning the returned data to the $data variable
    curl_close($ch); // Closing cURL
    return $data; // Returning the data from the function
}

var_dump(curl("https://www.betfair.com/exchange/football/event?id=28040884"));
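(I would also recommend setting CURLOPT_ENCODING => ''. This makes curl use compressed transfers where supported. HTML compresses very well with gzip, and curl is usually compiled with gzip support, so the page downloads faster and curl_exec() returns sooner.)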
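To make the two suggestions above concrete, here is a minimal sketch that puts the redirect and compression options into one options array. The helper names `build_curl_options` and `fetch`, the redirect cap of 5, and the exception on failure are assumptions made for illustration; they are not part of the original code.

```php
<?php
// Hypothetical helper: gathers the cURL options discussed above in one place.
function build_curl_options(string $url): array {
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow 3xx redirects such as the 302 shown above
        CURLOPT_MAXREDIRS      => 5,    // assumption: a small cap to avoid redirect loops
        CURLOPT_ENCODING       => '',   // empty string: accept every encoding this curl build supports
    ];
}

// Hypothetical wrapper: returns the response body, or throws if the transfer fails.
function fetch(string $url): string {
    $ch = curl_init();
    curl_setopt_array($ch, build_curl_options($url));
    $data = curl_exec($ch);
    if ($data === false) {
        $error = curl_error($ch); // read the error message before closing the handle
        curl_close($ch);
        throw new RuntimeException("cURL request failed: " . $error);
    }
    curl_close($ch);
    return $data;
}
```

Calling `fetch("https://www.betfair.com/exchange/football/event?id=28040884")` should then follow the redirect instead of returning the empty 302 response. Note that the `Location` header above points at a URL with a `#` fragment, which is never sent to the server, so the returned HTML may be only the shell of a JavaScript-driven page rather than the event data itself.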