How to check whether multiple sites are indexed in Google

I found a script on Google that checks whether a site is indexed:

<?php

function getPagesIndexedGoogle($site)
{
    if ($site) {
        $curl = curl_init();
        curl_setopt_array($curl, array(
            CURLOPT_HEADER => 0,
            CURLOPT_RETURNTRANSFER => 1,
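            // urlencode() keeps special characters in $site from breaking the query string
            CURLOPT_URL => "https://www.google.com.au/search?q=" . urlencode("site:$site") . "&gws_rd=ssl",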
            CURLOPT_SSL_VERIFYPEER=> false,
            CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko'
        ));
        $result_string = curl_exec($curl);
        curl_close($curl);
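        // curl_exec() returns false when the request fails
        if ($result_string === false || strpos($result_string, "did not match any documents") !== false) {
            return 0;
        }
        // Return the count rather than echoing it, so the caller controls the output
        if (preg_match("/about ([0-9,]{1,12})/i", $result_string, $matches)) {
            return $matches[1];
        }
        return 0;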
    }
}

if (!empty($_POST['domain'])) {
    $site = $_POST['domain'];
    // Escape user input before echoing it back into the page
    echo htmlspecialchars($site), ': ', getPagesIndexedGoogle($site);
}

?>

How can I check multiple URLs?

I tried using foreach, but it does not work. Please help.

As requested, this should do the trick:

<?php

function getPagesIndexedGoogle($site)
{
    if ($site) {
        $curl = curl_init();
        curl_setopt_array($curl, array(
            CURLOPT_HEADER => 0,
            CURLOPT_RETURNTRANSFER => 1,
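            // A literal "&" is needed here ("&amp;" is only for HTML output); urlencode() protects the query
            CURLOPT_URL => "https://www.google.com.au/search?q=" . urlencode("site:$site") . "&gws_rd=ssl",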
            CURLOPT_SSL_VERIFYPEER=> false,
            CURLOPT_USERAGENT => 'Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko'
        ));
        $result_string = curl_exec($curl);
        curl_close($curl);
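        // curl_exec() returns false when the request fails
        if ($result_string === false || strpos($result_string, "did not match any documents") !== false) {
            return 0;
        }
        // Return the count rather than echoing it, so the caller controls the output
        if (preg_match("/about ([0-9,]{1,12})/i", $result_string, $matches)) {
            return $matches[1];
        }
        return 0;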
    }
}

if (!empty($_POST['domain'])) {

    // Split the input on any run of whitespace (spaces or new lines), dropping empty entries
    $sites = preg_split('#\s+#', trim($_POST['domain']), -1, PREG_SPLIT_NO_EMPTY);

    foreach ($sites as $site) {
        // Escape user input before echoing it back into the page
        $label = htmlspecialchars($site);
        // Checks if the URL is a valid website or not (http(s):// must be included!)
        if (preg_match('#((https?|ftp):\/\/(\S*?\.\S*?))([\s)\[\]{},;"\':<]|\.\s|$)#i', $site)) {
            echo $label, ': ', getPagesIndexedGoogle($site), "<br>\n";
        } else {
            echo $label, " is not a valid URL.<br>\n";
        }
    }

} else {
    echo "No websites were entered.";
}

?>
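
If you want to do something with the numbers afterwards (sort them, sum them, store them), it may help to turn the matched text into an integer. The following is only a rough sketch: `parseResultCount` is a hypothetical helper, not part of the original script, and it assumes the page still contains a phrase like "About 1,230 results", which Google can change at any time. It works on a string you already fetched, so it can be tried without making a request.

```php
<?php

/**
 * Hypothetical helper: reads the result count out of a results page.
 * Returns 0 for the "no documents" message, the count as an integer when an
 * "about N" phrase is found, and null when neither is present.
 */
function parseResultCount(string $html): ?int
{
    if (strpos($html, 'did not match any documents') !== false) {
        return 0;
    }
    if (preg_match('/about ([0-9][0-9,]*)/i', $html, $matches)) {
        // Strip the thousands separators before casting: "1,230" becomes 1230
        return (int) str_replace(',', '', $matches[1]);
    }
    return null;
}

var_dump(parseResultCount('<div>About 1,230 results</div>'));          // int(1230)
var_dump(parseResultCount('Your search did not match any documents')); // int(0)
var_dump(parseResultCount('<html></html>'));                           // NULL
```

Returning null for an unrecognised page keeps "nothing indexed" (0) apart from "could not read the page", for example when the request was blocked.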

I would still recommend using a simple text field rather than a textarea. It is just less error-prone.
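
For what it is worth, here is a minimal sketch of how the value of such a single-line field could be split on the server. It assumes the entries are separated by commas and/or spaces; `parseDomainField` is a made-up name for illustration, and the function only splits and de-duplicates, it does not validate the URLs.

```php
<?php

/**
 * Hypothetical helper: splits the raw value of a single-line text field
 * into a list of entries.
 * Assumption: entries are separated by commas and/or whitespace.
 */
function parseDomainField(string $raw): array
{
    // PREG_SPLIT_NO_EMPTY drops the empty strings left by repeated separators
    $parts = preg_split('/[\s,]+/', $raw, -1, PREG_SPLIT_NO_EMPTY);
    // array_unique() removes repeated entries; array_values() reindexes from 0
    return array_values(array_unique($parts));
}

print_r(parseDomainField('http://example.com, http://example.org  http://example.com'));
```

Each entry of the returned array could then go through the same validity check and `getPagesIndexedGoogle()` call as in the loop above.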