PhantomJS to capture next page content after button click event

Question

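I am trying to capture the second page's content after calling the click method, but it returns the first page's content instead.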

const status = await page.open('https://www.dubailand.gov.ae/English/services/Eservices/Pages/Brokers.aspx');
console.log(status);
await page.evaluate(function() {
    document.querySelector('#ctl00_ctl42_g_26779dcd_6f3a_42ae_903c_59dea61690e9_dpPager > a.NextPageLink').click();
})

const content = await page.property('content');
console.log(content);

I have done a similar task with puppeteer, but switched to phantomjs because of deployment problems with puppeteer. Any help is appreciated.

Answer 1

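You get the first page because you request the page content immediately after clicking the "next" button, but you need to wait for the Ajax request to finish. This can be done by watching the "tree palm" ajax loader: when it is no longer visible, the results are in: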

// Utility function to pass time: await timeout(ms)
const timeout = ms => new Promise(resolve => setTimeout(resolve, ms));

// emulate a realistic client's screen size
await page.property('viewportSize', { width: 1280, height: 720 });

const status = await page.open('https://www.dubailand.gov.ae/English/services/Eservices/Pages/Brokers.aspx');

await page.evaluate(function() {
    document.querySelector('#ctl00_ctl42_g_26779dcd_6f3a_42ae_903c_59dea61690e9_dpPager > a.NextPageLink').click();
});

// Give it time to start request
await timeout(1000);

// Wait until the loader is gone
while (await page.evaluate(function() {
    // number of loader elements currently visible on the page
    return jQuery(".Loader_large:visible").length;
}) > 0)
{
    await timeout(1000);
    console.log(".");
}

// Now for scraping
let contacts = await page.evaluate(function(){

    var contacts = []; 
    jQuery("#tbBrokers tr").each(function(i, row) {
        contacts.push({
            "title": jQuery(row).find("td:nth-child(2)").text().trim(),
            "phone": jQuery(row).find("td:nth-child(4)").text().trim()
        });
    });

    return contacts;
});

console.log(contacts);
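
The wait loop above never gives up if the loader stays on screen (for example, when the Ajax request fails). As a minimal sketch, the polling could be wrapped in a helper with an upper time limit. Note that `waitFor`, `interval` and `maxWait` are hypothetical names introduced here for illustration; they are not part of PhantomJS or of any library.

```javascript
// Utility function to pass time: await timeout(ms)
const timeout = ms => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical helper (an assumption, not a PhantomJS API):
// polls an async predicate until it returns a truthy value
// or until maxWait milliseconds have passed.
async function waitFor(predicate, { interval = 500, maxWait = 30000 } = {}) {
    const deadline = Date.now() + maxWait;
    while (Date.now() < deadline) {
        if (await predicate()) {
            return true;    // condition met
        }
        await timeout(interval);
    }
    return false;           // gave up before the condition was met
}
```

With the `page` object from the answer, the predicate would presumably be something like `() => page.evaluate(function() { return jQuery(".Loader_large:visible").length === 0; })`, and a `false` result from `waitFor` would mean the loader was still visible when the time limit ran out.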

javascript

node.js

headless-browser

phantomjs