Multiple external scripts with output buffering
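I am trying to create a batch job that calls 4 external PHP scripts. I can do this with curl_exec, but it waits for all 4 scripts to finish before sending any text to the screen. I am sure this can be optimized, but I can't really get it to work.
Any suggestions on how to do this with curl_multi_getcontent + output buffering so that it displays the data as it gets it from the external sites?
curl_multi does that.
"Any suggestions on how to do this with curl_multi_getcontent + output buffering so it displays the data as it gets it from the external sites?"
Well, there's really no need to buffer it. Printing the response to stdout is curl's default behavior anyway, so if you just want to print it to the terminal, you don't need to add any printing code at all :)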
$mh=curl_multi_init();
$urls=array(
    "url1",
    "url2",
    "url3",
    "url4"
);
$curls=array();
foreach($urls as $url){
    $ch=curl_init($url);
    curl_multi_add_handle($mh,$ch);
    $curls[]=$ch;
}
for(;;){
    curl_multi_exec($mh,$active);
    if($active<1){
        // all downloads finished
        break;
    }
    curl_multi_select($mh);
}
// cleanup
foreach($curls as $ch){
    curl_multi_remove_handle($mh,$ch);
    curl_close($ch);
}
curl_multi_close($mh);
Note that this approach is only safe when you have a small-ish number of URLs. Once you get to ~100 URLs or more, you should probably consider queueing so that you don't get auto-firewall-banned-for-ddos. For example, check out function curl_fetch_multi_2(array $urls_unique, int $max_connections = 100, array $additional_curlopts = null) in https://gist.github.com/divinity76/79efd7b8c0d7849b956cd194659c98e5, which will never fetch more than $max_connections (configurable, default 100) URLs simultaneously :)
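To illustrate the queueing idea, here is a minimal sketch of a rolling window on top of curl_multi. It is not the gist's curl_fetch_multi_2: the function name fetch_limited, its signature, and the choice to return bodies keyed by URL (false on failure) are all assumptions made for this example.

```php
<?php
/**
 * Hypothetical helper (not the gist's implementation): fetch $urls with at
 * most $maxConnections transfers in flight. Returns [url => body|false].
 */
function fetch_limited(array $urls, int $maxConnections = 10): array
{
    $maxConnections = max(1, $maxConnections); // avoid a window of zero
    $queue = array_values(array_unique($urls)); // results are keyed by URL
    $results = [];
    $inFlight = 0;
    $mh = curl_multi_init();

    while ($queue !== [] || $inFlight > 0) {
        // Top up the window: never more than $maxConnections active handles.
        while ($inFlight < $maxConnections && $queue !== []) {
            $url = array_shift($queue);
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_PRIVATE, $url); // remember the requested URL
            curl_multi_add_handle($mh, $ch);
            $inFlight++;
        }

        curl_multi_exec($mh, $active);

        // Collect every transfer that has finished so far.
        $finishedAny = false;
        while ($info = curl_multi_info_read($mh)) {
            if ($info['msg'] !== CURLMSG_DONE) {
                continue;
            }
            $ch = $info['handle'];
            $url = curl_getinfo($ch, CURLINFO_PRIVATE);
            $results[$url] = $info['result'] === CURLE_OK
                ? curl_multi_getcontent($ch)
                : false;
            curl_multi_remove_handle($mh, $ch);
            $inFlight--;
            $finishedAny = true;
        }

        // Wait for network activity only if no slot was freed in this pass.
        if (!$finishedAny && $inFlight > 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000); // select can fail immediately; avoid a busy loop
        }
    }

    curl_multi_close($mh);
    return $results;
}
```

Calling fetch_limited($urls, 4) would then keep at most 4 downloads running at any time and start the next queued URL as soon as one finishes.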
- Edit: per the comments, if you for some reason want to buffer the output, it gets more complex, but you can still do it, for example:
<?php
$mh = curl_multi_init();
$urls = array(
    "https://example.com",
    "https://example.net",
    "https://example.org",
);
$curls = array();
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_ENCODING => '',
        CURLOPT_RETURNTRANSFER => 1
    ));
    curl_multi_add_handle($mh, $ch);
    $curls[(int)$ch] = $ch;
}
while (!empty($curls)) {
    for (;;) {
        curl_multi_exec($mh, $active);
        if ($active < count($curls)) {
            // at least 1 download has finished, process it
            break;
        }
        curl_multi_select($mh);
    }
    while (($info = curl_multi_info_read($mh))) {
        if ($info['msg'] !== CURLMSG_DONE) {
            continue;
        }
        $ch = $info['handle'];
        $url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
        $response = curl_multi_getcontent($ch);
        $cap_response = true;
        if ($cap_response) {
            $response = substr($response, 0, 50);
        }
        if ($info['result'] !== CURLE_OK) {
            echo "Error: {$url}: " . curl_error($ch) . "\n";
        }
        echo "download finished: " . var_export(["url" => $url, "response" => $response], true) . "\n";
        // cleanup
        unset($curls[(int)$ch]);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
}
// cleanup
curl_multi_close($mh);
This code prints:
$ php fuk.php
download finished: array (
  'url' => 'https://example.com/',
  'response' => '<!doctype html>
<html>
<head>
    <title>Example D',
)
download finished: array (
  'url' => 'https://example.net/',
  'response' => '<!doctype html>
<html>
<head>
    <title>Example D',
)
download finished: array (
  'url' => 'https://example.org/',
  'response' => '<!doctype html>
<html>
<head>
    <title>Example D',
)
It downloads example.com, example.net and example.org in parallel and buffers the output :)
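If the goal is to show data while it is still arriving, rather than once each download has finished, one option is CURLOPT_WRITEFUNCTION, which hands every chunk to a callback as curl receives it. A minimal sketch, assuming the hypothetical helper names stream_all and echo_and_flush (neither comes from the answer above):

```php
<?php
/**
 * Hypothetical helper: run all $urls in parallel and pass each received
 * chunk to $onChunk(string $url, string $chunk) as soon as it arrives.
 */
function stream_all(array $urls, callable $onChunk): void
{
    $mh = curl_multi_init();
    $handles = [];
    foreach ($urls as $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_WRITEFUNCTION,
            function ($ch, string $chunk) use ($url, $onChunk): int {
                $onChunk($url, $chunk);
                return strlen($chunk); // tell curl the whole chunk was consumed
            }
        );
        curl_multi_add_handle($mh, $ch);
        $handles[] = $ch;
    }

    do {
        $status = curl_multi_exec($mh, $active);
        if ($active > 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000); // select can fail immediately; avoid a busy loop
        }
    } while ($active > 0 && $status === CURLM_OK);

    foreach ($handles as $ch) {
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
}

/** Hypothetical callback: print a chunk and push it to the client right away. */
function echo_and_flush(string $url, string $chunk): void
{
    echo $chunk;
    if (ob_get_level() > 0) {
        ob_flush(); // empty PHP's own output buffer, if one is active
    }
    flush();
}
```

For example, stream_all(["https://example.com", "https://example.net"], 'echo_and_flush') would print chunks from both sites interleaved in arrival order, so prefix each chunk with $url in the callback if you need to tell them apart.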