Get forms with DomCrawler in Laravel 5.2
I'm working on a Laravel 5.2 command and trying to get a form using the Symfony DomCrawler component.
With the help of the DomCrawler docs and API, I have this code:
use Illuminate\Console\Command;
use GuzzleHttp\Client as GuzzleClient;
use Symfony\Component\DomCrawler\Crawler;
And, in the handle() method:
$fake_body = '<html>
<head>
</head>
<body>
<div class="row search-filtro" style=" margin-top: 10px;">
<form id="search_form" action="http://somesite.com/">
<select class="form-control" id="slc_region" name="slc_region" form="form_busqueda" >
<option value="default" disabled selected style="display: none;">Ciudad</option>
<option value="default">Todo</option>
<option value="1">Región Metropolitana</option>
<option value="2">XV Arica y Parinacota</option>
</select>
<select class="form-control" id="slc_tipo" name="slc_tipo" form="form_busqueda" >
<option value="default" disabled selected style="display: none;">Categoría</option>
<option value="default">Todo</option><option value="Tiempo Libre">Tiempo Libre</option>
<option value="Otros">Otros</option><option value="Tecnología">Tecnología</option>
<option value="Salud, Deporte y Belleza">Salud, Deporte y Belleza</option>
<option value="Mi Casa">Mi Casa</option><option value="Infantil">Infantil</option>
<option value="Vestuario y Calzado">Vestuario y Calzado</option>
</select>
<input type="text" id="buscar_inp" name="buscar_inp" class="form-control" placeholder="Buscar Comercio..." >
<button type="button" id="buscar_btn" class="btn btn-search btn-lg col-sm-12">BUSCAR</button>
</form>
</div>
</body>
</html>
';
$site = new Crawler( $fake_body );
$form = $site->filter('form')->form();
I'm writing this inside a Laravel command, so when I run php artisan scrap-site in the console,
the script stops with this error message:
[InvalidArgumentException]
Current URI must be an absolute URL ("").
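I have tried setting the form's action attribute to a relative URL, to an absolute URL, with http, with https, and removing the attribute entirely, but I always get the same error.
Tracing the error message, I found the abstract class AbstractUriElement in the file
vendor/symfony/dom-crawler/AbstractUriElement.php; the exception is thrown in its __construct method.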
/**
* @param \DOMElement $node A \DOMElement instance
* @param string $currentUri The URI of the page where the link is embedded (or the base href)
* @param string $method The method to use for the link (get by default)
*
* @throws \InvalidArgumentException if the node is not a link
*/
public function __construct(\DOMElement $node, $currentUri, $method = 'GET')
{
if (!in_array(strtolower(substr($currentUri, 0, 4)), array('http', 'file'))) {
throw new \InvalidArgumentException(sprintf('Current URI must be an absolute URL ("%s").', $currentUri));
}
$this->setNode($node);
$this->method = $method ? strtoupper($method) : null;
$this->currentUri = $currentUri;
}
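To make the guard in that constructor easier to reason about, here is a minimal plain-PHP sketch of the same check. The helper name looksAbsolute is hypothetical and not part of Symfony; it only mirrors the condition quoted above, which inspects the first four characters of $currentUri and never looks at the form's action attribute.

```php
<?php
// Hypothetical helper mirroring the guard quoted above; not part of Symfony.
// It returns true only when the URI starts with "http" or "file" (case-insensitive).
function looksAbsolute($currentUri)
{
    $prefix = strtolower(substr((string) $currentUri, 0, 4));

    return in_array($prefix, array('http', 'file'), true);
}

var_dump(looksAbsolute(''));                       // bool(false): an empty URI fails the check
var_dump(looksAbsolute('/search'));                // bool(false): a relative path fails too
var_dump(looksAbsolute('http://my-project.dev/')); // bool(true)
```

Because the check runs on $currentUri rather than on the action attribute, changing the action in the HTML would not be expected to affect it.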
When I echo the $currentUri parameter, it is empty! :(
Any ideas?
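Just pass the root URL to the Crawler as its second constructor argument and it works.
That argument is the current URI that the Crawler hands to the form; if you omit it, the URI stays empty: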
$fake_body = '<html>
<head>
</head>
<body>
<div class="row search-filtro" style=" margin-top: 10px;">
<form id="search_form" action="http://somesite.com/">
<select class="form-control" id="slc_region" name="slc_region" form="form_busqueda" >
<option value="default" disabled selected style="display: none;">Ciudad</option>
<option value="default">Todo</option>
<option value="1">Región Metropolitana</option>
<option value="2">XV Arica y Parinacota</option>
</select>
<select class="form-control" id="slc_tipo" name="slc_tipo" form="form_busqueda" >
<option value="default" disabled selected style="display: none;">Categoría</option>
<option value="default">Todo</option><option value="Tiempo Libre">Tiempo Libre</option>
<option value="Otros">Otros</option><option value="Tecnología">Tecnología</option>
<option value="Salud, Deporte y Belleza">Salud, Deporte y Belleza</option>
<option value="Mi Casa">Mi Casa</option><option value="Infantil">Infantil</option>
<option value="Vestuario y Calzado">Vestuario y Calzado</option>
</select>
<input type="text" id="buscar_inp" name="buscar_inp" class="form-control" placeholder="Buscar Comercio..." >
<button type="button" id="buscar_btn" class="btn btn-search btn-lg col-sm-12">BUSCAR</button>
</form>
</div>
</body>
</html>
';
$site = new Crawler( $fake_body, 'http://my-project.dev/' );
$form = $site->filter('form')->form();
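As a rough way to try this outside the command, the sketch below uses a shortened stand-in for $fake_body and reads two values back from the resulting form. It assumes symfony/dom-crawler is installed through Composer and that the script sits next to the vendor directory; the trimmed HTML and the $html variable name are illustrative only.

```php
<?php
// Assumes symfony/dom-crawler was installed with Composer. Inside a Laravel
// command the autoloader is already loaded, so this require is not needed there.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

// Shortened stand-in for $fake_body (illustrative only).
$html = '<html><body>
<form id="search_form" action="http://somesite.com/">
<input type="text" id="buscar_inp" name="buscar_inp">
</form>
</body></html>';

// Without a current URI: on the version quoted in the question this throws.
// Other releases of the component may behave differently, hence the try/catch.
try {
    (new Crawler($html))->filter('form')->form();
} catch (\InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}

// With a current URI passed as the second constructor argument.
$site = new Crawler($html, 'http://my-project.dev/');
$form = $site->filter('form')->form();

echo $form->getMethod(), "\n";                        // GET (the form has no method attribute)
echo $form->getFormNode()->getAttribute('id'), "\n";  // search_form
```

When the HTML comes from a real HTTP request rather than a string literal, the URL that was requested is the natural value to pass as the second argument.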