CasperJS - Access page's content while trying to fill drop-down menu through a loop
I'm trying to do some testing with CasperJS. The specific case here is:
extracting city names from a drop-down menu (already done),
then selecting each city (with casper.fill()), which loads new content and changes the URL of the page (successful when testing with a single city name; fails when looping through the list of city names),
going one level further through the newly loaded items' links (new pages),
finally, grabbing the content from each single page.
I tried to write a loop that iterates over the list of cities and does all of this work in each iteration. The problem is that CasperJS sets the <option> field value for every city immediately, one after another, without executing the rest of the code inside the loop:
casper.then(function() {
    var citiesLength = cities.length;
    for (var i = 0; i < citiesLength; i++) {
        this.fill('form.wpv-filter-form', { //setting drop-down field value to the city names in order of the items in the array
            'city[]': cityNames[i]
        });
        // Apparently the code below (to the end of the loop) doesn't get executed
        casper.thenEvaluate(function() {
            // Here the url change is being checked to know when the new content is loaded:
            var regexString = '(\?)(city)(\[\])(=)(' + cityNames[i] + ')&';
            var regex = new RegExp(regexString, "igm");
            this.waitForUrl(regex, function(){
                var name = this.getHTML('.kw-details-title');
                link = this.evaluate(getFirstItemLink); // for test, just getting the first item's link
                casper.open(link).then(function(){
                    this.echo("New Page is loaded......");
                    // Grab the single item contents
                });
            });
        });
    }
});
Here is the log (shortened to 3 cities):
[debug] [remote] Set "city[]" field value to city1
[info] [remote] attempting to fetch form element from selector: 'form.wpv-filter-form'
[debug] [remote] Set "city[]" field value to city2
[info] [remote] attempting to fetch form element from selector: 'form.wpv-filter-form'
[debug] [remote] Set "city[]" field value to city3
[info] [remote] attempting to fetch form element from selector: 'form.wpv-filter-form'
[info] [remote] attempting to fetch form element from selector: 'form.wpv-filter-form'
[info] [remote] attempting to fetch form element from selector: 'form.wpv-filter-form'
[info] [phantom] Step anonymous 5/5: done in 123069ms.
[info] [phantom] Step _step 6/79 https ://domain.com/section/ (HTTP 200)
[info] [phantom] Step _step 6/79: done in 123078ms.
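P.S. Is using casper.open() a good way to reach the second-level pages (the item pages)? Do I need to close them somehow after getting their content?
Thanks
It's hard to give you an exact answer because your problem can't be reproduced. However, I noticed several issues in your script...
1. Avoid "nesting hell"
CasperJS is organized around steps. With this library, a script usually looks like this: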
casper.start('http://www.website.com/');
casper.then(function () {
    // Step 1
});
casper.then(function () {
    // Step 2
});
casper.then(function () {
    // Step 3
});
casper.run();
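The then methods are not promises, but they serve the same objective: flattening the code. So once you are nesting beyond a certain depth, you are clearly doing something wrong.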
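To make the step idea concrete, here is a minimal sketch of a toy step queue in plain JavaScript. It is not CasperJS's real implementation; createToyRunner, direct, nestedLoop and flatLoop are hypothetical names invented for this illustration. It only models two assumptions: a direct call (like fill) runs immediately, while then merely registers a callback that runs later.

```javascript
// Toy model of a step queue (NOT CasperJS's real implementation).
function createToyRunner() {
    var steps = [];
    var log = [];
    return {
        log: log,
        // Stands in for a direct call such as fill(): runs right away.
        direct: function (label) {
            log.push('direct: ' + label);
        },
        // Stands in for then(): only registers the callback for later.
        then: function (fn) {
            steps.push(fn);
            return this;
        },
        // Stands in for run(): executes queued steps in order, including
        // steps that were queued while another step was running.
        run: function () {
            while (steps.length > 0) {
                steps.shift().call(this);
            }
            return log;
        }
    };
}

// Shape of the question's code: direct calls and steps mixed in one loop.
function nestedLoop(cities) {
    var runner = createToyRunner();
    runner.then(function () {
        for (var i = 0; i < cities.length; i++) {
            this.direct('fill ' + cities[i]);
            this.then(function () {
                // Runs after the loop has finished, so "i" is already
                // cities.length and cities[i] is undefined.
                this.log.push('step: wait for ' + cities[i]);
            });
        }
    });
    return runner.run();
}

// Flat shape: one "fill" step and one "wait" step per city.
function flatLoop(cities) {
    var runner = createToyRunner();
    cities.forEach(function (city) {
        runner.then(function () { this.direct('fill ' + city); });
        runner.then(function () { this.log.push('step: wait for ' + city); });
    });
    return runner.run();
}
```

In this toy model, nestedLoop(['a', 'b']) performs both fills before any queued step runs, which resembles the log in the question, while flatLoop(['a', 'b']) alternates fill and wait for each city.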
2. Be careful with evaluate
The concept behind this method is probably the most difficult to understand when discovering CasperJS. As a reminder, think of the evaluate() method as a gate between the CasperJS environment and the one of the page you have opened; everytime you pass a closure to evaluate(), you’re entering the page and execute code as if you were using the browser console.
In your example, you use this.evaluate() inside thenEvaluate(). I'm sure that's not what you want to do...
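One practical consequence of this "gate" is that variables from the script side, such as cityNames[i] in the question, are not visible inside the function handed to the page. The sketch below imitates that boundary in plain JavaScript by rebuilding the function from its source text. toyEvaluate and demo are hypothetical names, and this is only an analogy, not how CasperJS or PhantomJS is implemented. CasperJS's own evaluate() accepts extra arguments after the function for the same purpose.

```javascript
// Toy stand-in for the page boundary (NOT how CasperJS is implemented):
// the function is rebuilt from its source text, so variables captured
// from the calling scope are lost; only explicit arguments get through.
function toyEvaluate(fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    var rebuilt = new Function('return (' + fn.toString() + ');')();
    return rebuilt.apply(null, args);
}

function demo() {
    var cityName = 'city1'; // exists on the "script" side only

    var viaClosure = toyEvaluate(function () {
        // The rebuilt function cannot see the outer variable.
        return typeof cityName;
    });

    var viaArgument = toyEvaluate(function (name) {
        // Values passed as arguments do cross the boundary.
        return 'city[]=' + name;
    }, cityName);

    return { viaClosure: viaClosure, viaArgument: viaArgument };
}
```

Calling demo() shows the captured variable as undefined on the "page" side, while the value passed as an argument arrives intact.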
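3. this is not always what you expect
If we consider the two previous points (nesting and evaluate), it seems you are not using this the right way. When you are in the PhantomJS/CasperJS environment, this is your casper instance. But inside evaluate, you are in the page DOM environment, which means this becomes window. If that is still unclear, here is a sample script: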
var casper = require('casper').create();
casper.start('http://casperjs.org/');
casper.then(function () {
    // "this" is "casper"
    console.log(this.getCurrentUrl()); // http://casperjs.org/
});
casper.then(function () {
    // "this" is "casper"
    this.echo(this.evaluate(function () {
        // "this" is "window"
        return this.location.href; // http://casperjs.org/
    }));
});
casper.run();
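There are several problems in your code. One is mismatched steps (the then* and wait* functions): you are mixing direct calls (casper.fill) with steps (thenEvaluate).
Another problem is that this does not refer to casper inside the page context (inside evaluate and thenEvaluate).
This should work: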
cityNames.forEach(function(cityName){
    casper.then(function(){
        this.fill('form.wpv-filter-form', { //setting drop-down field value to the city names in order of the items in the array
            'city[]': cityName
        });
    });
    casper.then(function(){
        // Backslashes are doubled so they survive the string literal and reach RegExp
        var regexString = '(\\?)(city)(\\[\\])(=)(' + cityName + ')&';
        var regex = new RegExp(regexString, "igm");
        this.waitForUrl(regex, function(){
            var name = this.getHTML('.kw-details-title');
            // getFirstItemLink comes from the asker's script; for test, just getting the first item's link
            var link = this.evaluate(getFirstItemLink);
            this.thenOpen(link).then(function(){
                this.echo("New Page is loaded......");
                // Grab the single item contents
            });
        });
    });
});
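A side note on the URL pattern: because it is built from a string, every backslash has to be doubled, and a city name containing characters such as "." or "(" would change the meaning of the expression. The following is a minimal sketch of one way to build the pattern more defensively. escapeRegExp and buildCityUrlPattern are hypothetical helper names, and the sketch assumes the URL contains the literal text "?city[]=<name>&", as the pattern in the question implies. If the site percent-encodes the brackets or the name, the pattern would need adjusting.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
    return String(text).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds a pattern matching "?city[]=<name>&".
// Only the "i" flag is used; a "g" flag would make repeated test()
// calls depend on the regex's lastIndex state.
function buildCityUrlPattern(cityName) {
    return new RegExp('\\?city\\[\\]=' + escapeRegExp(cityName) + '&', 'i');
}
```

The resulting object could then be passed to waitForUrl in place of the hand-built regex, for example this.waitForUrl(buildCityUrlPattern(cityName), ...).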