
Multiple external scripts with output buffering

I'm trying to create a batch job that calls 4 external PHP scripts. I can do this with curl_exec, but it waits for all 4 scripts to finish before sending any text to the screen. I believe this can be optimized, but I can't really get it to work. Any suggestions on how to do this with curl_multi_getcontent + output buffering so it displays the data as it gets it from the external sites?

curl_multi indeed.

Any suggestions on how to do this with curl_multi_getcontent + output buffering so it displays the data as it gets it from the external sites?

Well, there's really no need to buffer it. Printing the response to stdout is curl's default behavior anyway, so if you just want it printed to the terminal, you don't need to add any printing code at all :)

$mh=curl_multi_init();
$urls=array(
"url1",
"url2",
"url3",
"url4"
);
$curls=array();
foreach($urls as $url){
    $ch=curl_init($url);
    curl_multi_add_handle($mh,$ch);
    $curls[]=$ch;
}
for(;;){
    curl_multi_exec($mh,$active);
    if($active<1){
        // all downloads finished
        break;
    }
    curl_multi_select($mh);
}
// cleanup
foreach($curls as $ch){
    curl_multi_remove_handle($mh,$ch);
    curl_close($ch);
}
curl_multi_close($mh);

Note that this approach is only safe when you have a small-ish number of URLs. Once you're fetching ~100 URLs or more, you should probably consider queuing them so you don't get auto-firewall-banned-for-ddos; for example, check function curl_fetch_multi_2(array $urls_unique, int $max_connections = 100, array $additional_curlopts = null) in https://gist.github.com/divinity76/79efd7b8c0d7849b956cd194659c98e5 , which will never fetch more than $max_connections (configurable, default 100) URLs at the same time :)
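For illustration, here's a minimal sketch of that queuing idea (it is not the gist's curl_fetch_multi_2, just a simplified stand-in; the function name fetch_queued, the example URLs, and the default limit are made up): at most $max_connections transfers are in flight at once, and the next queued URL is started whenever one finishes.

<?php
// Minimal sketch of queued/capped parallel fetching (simplified, not the gist's
// curl_fetch_multi_2): at most $max_connections transfers run at the same time;
// whenever one finishes, the next queued URL is added. Names here are illustrative.
function fetch_queued(array $urls, int $max_connections = 100): array
{
    $mh = curl_multi_init();
    $queue = array_values($urls);
    $in_flight = 0;
    $responses = array();

    $add_next = function () use (&$queue, &$in_flight, $mh) {
        $ch = curl_init(array_shift($queue));
        curl_setopt_array($ch, array(
            CURLOPT_RETURNTRANSFER => 1,
            CURLOPT_ENCODING => '',
        ));
        curl_multi_add_handle($mh, $ch);
        ++$in_flight;
    };

    // prime the connection window
    while (!empty($queue) && $in_flight < $max_connections) {
        $add_next();
    }

    while ($in_flight > 0) {
        curl_multi_exec($mh, $still_running);
        curl_multi_select($mh);
        // harvest finished transfers and refill the window from the queue
        while (($info = curl_multi_info_read($mh))) {
            if ($info['msg'] !== CURLMSG_DONE) {
                continue;
            }
            $ch = $info['handle'];
            $url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
            $responses[$url] = curl_multi_getcontent($ch);
            curl_multi_remove_handle($mh, $ch);
            curl_close($ch);
            --$in_flight;
            if (!empty($queue)) {
                $add_next();
            }
        }
    }
    curl_multi_close($mh);
    return $responses;
}

// hypothetical usage: never more than 2 of these are fetched at the same time
// var_export(fetch_queued(array("https://example.com", "https://example.net", "https://example.org"), 2));

Responses are keyed by the effective URL (the same trick the buffered example below uses), so the caller can still match each response to its request.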

  • Edit: per the comments, if for some reason you do want to buffer the output, it gets more complicated, but it can still be done, for example:
<?php
$mh = curl_multi_init();
$urls = array(
    "https://example.com",
    "https://example.net",
    "https://example.org",
);
$curls = array();
foreach ($urls as $url) {
    $ch = curl_init($url);
    curl_setopt_array($ch, array(
        CURLOPT_ENCODING => '',
        CURLOPT_RETURNTRANSFER => 1
    ));
    curl_multi_add_handle($mh, $ch);
    // PHP 8+ returns CurlHandle objects, PHP 7 returns resources; derive an array key for both
    $curls[is_object($ch) ? spl_object_id($ch) : (int) $ch] = $ch;
}
while (!empty($curls)) {
    for (;;) {
        curl_multi_exec($mh, $active);
        if ($active < count($curls)) {
            // at least 1 download has finished, process it
            break;
        }
        curl_multi_select($mh);
    }
    while (($info = curl_multi_info_read($mh))) {
        if ($info['msg'] !== CURLMSG_DONE) {
            continue;
        }
        $ch = $info['handle'];
        $url = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
        $response = curl_multi_getcontent($ch);
        $cap_response = true;
        if ($cap_response) {
            $response = substr($response, 0, 50);
        }
        if ($info['result'] !== CURLE_OK) {
            echo "Error: {$url}: " . curl_error($ch) . "\n";
        }
        echo "download finished: " . var_export(["url" => $url, "response" => $response], true) . "\n";
        // cleanup
        unset($curls[is_object($ch) ? spl_object_id($ch) : (int) $ch]);
        curl_multi_remove_handle($mh, $ch);
        curl_close($ch);
    }
}
// cleanup
curl_multi_close($mh);

This code prints:

$ php fuk.php
download finished: array (
  'url' => 'https://example.com/',
  'response' => '<!doctype html>
<html>
<head>
    <title>Example D',
)
download finished: array (
  'url' => 'https://example.net/',
  'response' => '<!doctype html>
<html>
<head>
    <title>Example D',
)
download finished: array (
  'url' => 'https://example.org/',
  'response' => '<!doctype html>
<html>
<head>
    <title>Example D',
)

It downloads example.com, example.net, and example.org in parallel and buffers the output :)