Facebook Login using cURL and PHP

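I'm trying to access the Facebook login page using curl. My intention is to log in to Facebook and then do some scraping. I'm not using the Facebook API because of its latest restrictions: I need to collect the comments on posts, and that isn't possible using only the API.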

Here is some of my code:

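$ch = curl_init(); // create the handle the calls below operate on
curl_setopt($ch, CURLOPT_URL, "https://web.facebook.com");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the body instead of printing it directly
$response = curl_exec($ch);
curl_close($ch);
echo $response;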

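I expected it to redirect to the login page; then, once the user fills in the login form, I would take the credentials and use them to get redirected to the home page and start scraping.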

Anyway, this is what I get:

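(PS: I'm the author of the program.) This program logs into Facebook to send messages. The login code can be found here; the login is done in the constructor.

The gist is that you first need to make a GET request to get the cookies, the CSRF token and a few other things, all of which you parse out of the login form. You then send them back, together with the username and password, in an application/x-www-form-urlencoded POST request. The login URL is specific to your cookie session, so you also have to take that URL from the HTML you received in the first GET request.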
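The "parse the form, then POST it back" step can be sketched with nothing but PHP's built-in DOM extension. This is a minimal illustration, not the program's actual helper: the sample HTML, the `csrf_example` field name, the action URL and the `extractLoginForm()` function are all made up for the example, and the real page's markup will differ.

```php
<?php
// Hypothetical stand-in for the HTML returned by the first GET request.
$html = <<<'HTML'
<html><body>
<form id="login_form" method="post" action="/login/example/?refsrc=abc">
  <input type="hidden" name="csrf_example" value="abc123">
  <input type="text" name="email" value="">
  <input type="password" name="pass" value="">
</form>
</body></html>
HTML;

/**
 * Return the action URL and every named <input> value of the form with the given id.
 * (Illustrative helper, not part of any library.)
 */
function extractLoginForm(string $html, string $formId): array
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // real-world HTML is rarely valid, so silence parser warnings

    foreach ($dom->getElementsByTagName('form') as $form) {
        if ($form->getAttribute('id') !== $formId) {
            continue;
        }
        $fields = [];
        foreach ($form->getElementsByTagName('input') as $input) {
            $name = $input->getAttribute('name');
            if ($name !== '') {
                // Hidden inputs carry the token values that must be echoed back.
                $fields[$name] = $input->getAttribute('value');
            }
        }
        return ['action' => $form->getAttribute('action'), 'fields' => $fields];
    }
    throw new RuntimeException("form #$formId not found");
}

$login = extractLoginForm($html, 'login_form');
$login['fields']['email'] = 'user@example.com'; // placeholder credentials
$login['fields']['pass']  = 'secret';

// application/x-www-form-urlencoded body for the POST to $login['action']
$body = http_build_query($login['fields']);
```

With plain cURL, the same handle should be reused for the GET and the POST, and setting `CURLOPT_COOKIEFILE` to an empty string turns on cURL's in-memory cookie engine so the cookies from the first response are sent with the second request.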

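It's also in your best interest to use a user agent that implies you have poor JavaScript support (because in reality, with PHP, you have none). An example, and the one this code uses, is 'Mozilla/5.0 (BlackBerry; U; BlackBerry 9300; en) AppleWebKit/534.8+ (KHTML, like Gecko) Version/6.0.0.570 Mobile Safari/534.8+' (aka an old BlackBerry phone).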
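For readers using the plain cURL extension rather than a wrapper class, the user agent can be set roughly like this. It is a sketch under assumptions: `lowJsCurlOptions()` is a made-up helper name, and the extra options (cookie engine, redirects, returned body, compression) are simply what I would expect a login flow to need, not a statement of what the wrapper in the answer configures.

```php
<?php
/**
 * Options for a cURL handle that presents itself as an old BlackBerry browser.
 * (Illustrative helper, not part of any library.)
 */
function lowJsCurlOptions(): array
{
    return [
        // User agent string quoted in the answer.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (BlackBerry; U; BlackBerry 9300; en) AppleWebKit/534.8+ (KHTML, like Gecko) Version/6.0.0.570 Mobile Safari/534.8+',
        CURLOPT_HTTPHEADER     => ['Accept-Language: en-US,en;q=0.8'],
        CURLOPT_COOKIEFILE     => '',   // empty string: enable the in-memory cookie engine
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_RETURNTRANSFER => true, // return the body from curl_exec() instead of printing it
        CURLOPT_ENCODING       => '',   // accept every encoding this cURL build supports
    ];
}

$options = lowJsCurlOptions();
$ch = curl_init('https://m.facebook.com/');
$applied = curl_setopt_array($ch, $options);

// $html = curl_exec($ch); // uncomment to actually perform the first GET request
```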

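  • Now, if you use a smartphone user agent, it may sometimes ask you to install the smartphone app. If you get that question, it won't let you finish logging in until you answer yes or no, so you need to add code that detects the question when it is present and answers it. You can detect it with the XPath "//a[contains(@href,'/login/save-device/cancel/')]".

  • Protip: a good way to confirm that you logged in successfully is to look for the logout button, which in XPath looks like //a[contains(@href,"/logout.php")]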
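Both checks can be tried offline against canned HTML. The two XPath expressions below are the ones given above; the sample pages and the function names (`loadXPath`, `findSaveDeviceCancelHref`, `isLoggedIn`) are invented for this sketch.

```php
<?php
/** Parse an HTML string and return an XPath helper for it. */
function loadXPath(string $html): DOMXPath
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // tolerate sloppy markup
    return new DOMXPath($dom);
}

/** Return the href of the "don't install the app" link, or null if the prompt is absent. */
function findSaveDeviceCancelHref(DOMXPath $xp): ?string
{
    $nodes = $xp->query("//a[contains(@href,'/login/save-device/cancel/')]");
    return $nodes->length > 0 ? $nodes->item(0)->getAttribute('href') : null;
}

/** A logout link on the page is taken as proof that the login worked. */
function isLoggedIn(DOMXPath $xp): bool
{
    return $xp->query('//a[contains(@href,"/logout.php")]')->length > 0;
}

// Hypothetical responses: one showing the app prompt, one showing a logged-in page.
$promptPage = '<html><body><a href="/login/save-device/cancel/?flow=example">Not now</a></body></html>';
$homePage   = '<html><body><a href="/logout.php?h=example">Log out</a></body></html>';

$cancelHref = findSaveDeviceCancelHref(loadXPath($promptPage));
$loggedIn   = isLoggedIn(loadXPath($homePage));
```

In a real flow, a non-null `$cancelHref` would be requested (prefixed with the site's scheme and host) before checking for the logout link again.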

The most relevant part of the code is:

function __construct() {
    $this->recipientID = \MsgMe\getUserOption ( 'Facebook', 'recipientID', NULL );
    if (NULL === $this->recipientID) {
        throw new \Exception ( 'Error: cannot find [Facebook] recipientID option!' );
    }
    $this->email = \MsgMe\getUserOption ( 'Facebook', 'email', NULL );
    if (NULL === $this->email) {
        throw new \Exception ( 'Error: cannot find [Facebook] email option!' );
    }
    $this->password = \MsgMe\getUserOption ( 'Facebook', 'password', NULL );
    if (NULL === $this->password) {
        throw new \Exception ( 'Error: cannot find [Facebook] password option!' );
    }
    $this->hc = new \hhb_curl ();
    $hc = &$this->hc;
    $hc->_setComfortableOptions ();
    $hc->setopt_array ( array (
            CURLOPT_USERAGENT => 'Mozilla/5.0 (BlackBerry; U; BlackBerry 9300; en) AppleWebKit/534.8+ (KHTML, like Gecko) Version/6.0.0.570 Mobile Safari/534.8+',
            CURLOPT_HTTPHEADER => array (
                    'accept-language:en-US,en;q=0.8' 
            ) 
    ) );
    $hc->exec ( 'https://m.facebook.com/' );
    // \hhb_var_dump ( $hc->getStdErr (), $hc->getStdOut () ) & die ();
    $domd = new \DOMDocument (); // calling loadHTML() statically throws an Error on PHP 8+
    @$domd->loadHTML ( $hc->getResponseBody () );
    $form = (\MsgMe\tools\getDOMDocumentFormInputs ( $domd, true )) ['login_form'];
    $url = $domd->getElementsByTagName ( "form" )->item ( 0 )->getAttribute ( "action" );
    $postfields = (function () use (&$form): array {
        $ret = array ();
        foreach ( $form as $input ) {
            $ret [$input->getAttribute ( "name" )] = $input->getAttribute ( "value" );
        }
        return $ret;
    });
    $postfields = $postfields (); // sorry about that, eclipse can't handle IIFE syntax.
    assert ( array_key_exists ( 'email', $postfields ) );
    assert ( array_key_exists ( 'pass', $postfields ) );
    $postfields ['email'] = $this->email;
    $postfields ['pass'] = $this->password;
    $hc->setopt_array ( array (
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => http_build_query ( $postfields ),
            CURLOPT_HTTPHEADER => array (
                    'accept-language:en-US,en;q=0.8' 
            ) 
    ) );
    // \hhb_var_dump ($postfields ) & die ();
    $hc->exec ( $url );
    // \hhb_var_dump ( $hc->getStdErr (), $hc->getStdOut () ) & die ();

    $domd = new \DOMDocument (); // calling loadHTML() statically throws an Error on PHP 8+
    @$domd->loadHTML ( $hc->getResponseBody () );
    $xp = new \DOMXPath ( $domd );
    $InstallFacebookAppRequest = $xp->query ( "//a[contains(@href,'/login/save-device/cancel/')]" );
    if ($InstallFacebookAppRequest->length > 0) {
        // not all accounts get this, but some do, not sure why, anyway, if this exist, fb is asking "ey wanna install the fb app instead of using the website?"
        // and won't let you proceed further until you say yes or no. so we say no.
        $url = 'https://m.facebook.com' . $InstallFacebookAppRequest->item ( 0 )->getAttribute ( "href" );
        $hc->exec ( $url );
        $domd = new \DOMDocument (); // calling loadHTML() statically throws an Error on PHP 8+
        @$domd->loadHTML ( $hc->getResponseBody () );
        $xp = new \DOMXPath ( $domd );
    }
    unset ( $InstallFacebookAppRequest, $url );
    $urlinfo = parse_url ( $hc->getinfo ( CURLINFO_EFFECTIVE_URL ) );
    $a = $xp->query ( '//a[contains(@href,"/logout.php")]' );
    if ($a->length < 1) {
        $debuginfo = $hc->getStdErr () . $hc->getStdOut ();
        $tmp = tmpfile ();
        fwrite ( $tmp, $debuginfo );
        $debuginfourl = shell_exec ( "cat " . escapeshellarg ( stream_get_meta_data ( $tmp ) ['uri'] ) . " | pastebinit" );
        fclose ( $tmp );
        throw new \RuntimeException ( 'failed to login to facebook! apparently... cannot find the logout url!  debuginfo url: ' . $debuginfourl );
    }
    $a = $a->item ( 0 );
    $url = $urlinfo ['scheme'] . '://' . $urlinfo ['host'] . $a->getAttribute ( "href" );
    $this->logoutUrl = $url;
    // all initialized, ready to sendMessage();
}