Goutte Client: how to store and retrieve $crawler?
My code looks like this:
<?php
require_once 'vendor/autoload.php';
use Goutte\Client;
use Symfony\Component\HttpClient\HttpClient;
// Generate a random alphanumeric string of the given length
function generateRandomString($length = 10)
{
    $characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $charactersLength = strlen($characters);
    $randomString = '';
    for ($i = 0; $i < $length; $i++) {
        $randomString .= $characters[rand(0, $charactersLength - 1)];
    }
    return $randomString;
}
// Create the Goutte client
$client = new Client(HttpClient::create(array(
    'headers' => array(
        'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language' => 'en-US,en;q=0.5',
        'Connection' => 'keep-alive',
    ),
)));
// Request the login page
$crawler = $client->request('GET', 'example.com/login');
$session_id = generateRandomString(15);
// Write the serialized crawler object to a text file
$objData = serialize($crawler);
$filePath = getcwd() . DIRECTORY_SEPARATOR . "sessions" . DIRECTORY_SEPARATOR . "obj" . $session_id . ".txt";
$fp = fopen($filePath, "w");
fwrite($fp, $objData);
fclose($fp);
// Read the text file back and unserialize the object
$crawler_new = file_get_contents($filePath);
$obj = unserialize($crawler_new);
print_r($obj);
die();
The code above produces the following output:
Warning: print_r(): Invalid State Error in C:\xampp\htdocs\verisys\index.php on line 80
Warning: print_r(): Invalid State Error in C:\xampp\htdocs\verisys\index.php on line 80
Warning: print_r(): Invalid State Error in C:\xampp\htdocs\verisys\index.php on line 80
Warning: print_r(): Invalid State Error in C:\xampp\htdocs\verisys\index.php on line 80
Warning: print_r(): Invalid State Error in C:\xampp\htdocs\verisys\index.php on line 80
.
.
.
Warning: print_r(): Invalid State Error in C:\xampp\htdocs\verisys\index.php on line 80
Symfony\Component\DomCrawler\Crawler Object
(
[uri:protected] => example.com/login/
[defaultNamespacePrefix:Symfony\Component\DomCrawler\Crawler:private] => default
[namespaces:Symfony\Component\DomCrawler\Crawler:private] => Array
(
)
[baseHref:Symfony\Component\DomCrawler\Crawler:private] => example.com/login/
[document:Symfony\Component\DomCrawler\Crawler:private] => DOMDocument Object
(
[implementation] => (object value omitted)
[strictErrorChecking] =>
[config] =>
[formatOutput] =>
[validateOnParse] =>
[resolveExternals] =>
[preserveWhiteSpace] =>
[recover] =>
[substituteEntities] =>
)
[nodes:Symfony\Component\DomCrawler\Crawler:private] => Array
(
[0] => DOMElement Object
(
[schemaTypeInfo] =>
)
)
[isHtml:Symfony\Component\DomCrawler\Crawler:private] => 1
[html5Parser:Symfony\Component\DomCrawler\Crawler:private] =>
)
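Can anyone help me store the $crawler object in a file?
Basically, I want the customer to solve the reCaptcha manually.
I am working on a project in which the server performs the whole flow through Goutte. The login page is protected by reCaptcha, so I want the client to fill it in, and then the server should continue with the remaining steps.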
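The dump above suggests that the DOM nodes inside the crawler do not survive serialize()/unserialize(). One possible workaround, offered only as a sketch, is to store the raw HTML and the URI as plain data and build a fresh Crawler from them later. The helper names saveCrawlerState() and loadCrawlerState() are hypothetical, and the sketch assumes the page body is valid UTF-8 (otherwise json_encode() throws) and PHP 7.3 or later.

```php
<?php
// Hypothetical helpers: persist the page as plain data (HTML + URI)
// instead of calling serialize() on the Crawler object itself.

function saveCrawlerState(string $path, string $html, string $uri): void
{
    // JSON keeps both values in one file; throws JsonException on invalid UTF-8.
    $json = json_encode(['html' => $html, 'uri' => $uri], JSON_THROW_ON_ERROR);
    if (file_put_contents($path, $json, LOCK_EX) === false) {
        throw new RuntimeException("Could not write $path");
    }
}

function loadCrawlerState(string $path): array
{
    $json = file_get_contents($path);
    if ($json === false) {
        throw new RuntimeException("Could not read $path");
    }
    // Returns an array with the keys 'html' and 'uri'.
    return json_decode($json, true, 512, JSON_THROW_ON_ERROR);
}

// Possible usage with the $client, $crawler and $filePath from the question:
//   saveCrawlerState(
//       $filePath,
//       $client->getInternalResponse()->getContent(),
//       $crawler->getUri()
//   );
//   $state    = loadCrawlerState($filePath);
//   $restored = new \Symfony\Component\DomCrawler\Crawler($state['html'], $state['uri']);
```

The restored crawler only reproduces the parsed page. It does not carry the HTTP session, so the cookies have to be kept separately.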
Creating the same client again:
// Assumption: Cookie is Symfony\Component\BrowserKit\Cookie, which needs this import
use Symfony\Component\BrowserKit\Cookie;
$cokie = "JSESSIONID=0000H_WHw_eFPKVUDGxUei7v3PH:1db7cfi4s";
$client = new Client(HttpClient::create(array(
    'headers' => array(
        'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'Accept-Language' => 'en-US,en;q=0.5',
        'Connection' => 'keep-alive',
        'Host' => 'verification.nadra.gov.pk',
        "Cookie" => $cokie,
        'User-Agent' => 'Mozilla/5.0 (Windows NT x.y; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0'
    ),
)));
// Also put the session cookie into the client's cookie jar
$cookie = new Cookie("JSESSIONID", $cokie, null, "/service", "https://example.com/", true, true);
$client->getCookieJar()->set($cookie);
$client->setServerParameter('HTTP_USER_AGENT', 'Mozilla/5.0 (Windows NT x.y; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0');
$client->followRedirects(true);
// Request the captcha image with the restored session
$crawler = $client->request('GET', 'https://example.com/service/botdetectcaptcha?get=image&c=exampleCaptcha&t=508c5eaf74fd4858b0c9debafc319d67');
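Instead of hard-coding the JSESSIONID value, the cookies from the first request could be written to a file and loaded into the new client. The following is a minimal sketch under that assumption. saveCookieStrings() and loadCookieStrings() are hypothetical helper names, and the commented usage relies on BrowserKit's Cookie being castable to a string and on Cookie::fromString() parsing that string back, which is worth verifying against the installed Symfony version.

```php
<?php
// Hypothetical helpers: keep session cookies between requests as plain strings.

function saveCookieStrings(string $path, array $cookies): void
{
    // strval() works on plain strings and on objects that implement __toString().
    $lines = array_values(array_map('strval', $cookies));
    if (file_put_contents($path, json_encode($lines, JSON_THROW_ON_ERROR), LOCK_EX) === false) {
        throw new RuntimeException("Could not write $path");
    }
}

function loadCookieStrings(string $path): array
{
    $json = file_get_contents($path);
    if ($json === false) {
        throw new RuntimeException("Could not read $path");
    }
    // Returns a list of cookie strings in the order they were saved.
    return json_decode($json, true, 512, JSON_THROW_ON_ERROR);
}

// Possible usage ($cookieFile is an assumed path, e.g. inside the sessions directory):
//   saveCookieStrings($cookieFile, $client->getCookieJar()->all());
//   ...later, on a freshly created $client:
//   foreach (loadCookieStrings($cookieFile) as $line) {
//       $client->getCookieJar()->set(
//           \Symfony\Component\BrowserKit\Cookie::fromString($line, 'https://example.com/')
//       );
//   }
```

With the cookies in the jar, the manual "Cookie" header should no longer be needed, because the client sends the jar's cookies for matching domains and paths.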