Puppeteer: awaiting a selector and returning data from within
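I am loading a page and intercepting its requests; when a certain element shows up, I stop loading and extract the data I need...
Here is the problem I am facing.
Simplified, the code looks like this: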
async function loadPage()
{
    var contentLoaded = false;
    var content;

    // When the element shows up, do something.
    // In the real code this is page.waitForSelector; for simplicity I use a
    // timeout here, because the problem is the same.
    // The element "shows up" after 10 seconds, at which point the content
    // I want is extracted (set to 100 here) and stored.
    setTimeout(() => {
        content = 100;
        contentLoaded = true;
    }, 10000);

    // Here I have a function that loads the page: it intercepts requests
    // and handles them until the content is loaded.
    page.on('request', req => {
        if (!contentLoaded)
        {
            // keep loading page
        }
    });

    // This is the piece of code I would like NOT to run until I either get
    // the data or a timeout error from page.waitForSelector.
    // But JavaScript runs it right away, since it is not busy with the
    // loading function above. After 10 seconds the content shows up and is
    // stored, but by then this code has already finished and returned false.
    if (contentLoaded)
    {
        return content;
    }
    else
    {
        return false;
    }
}

var x = loadPage();
x.then(console.log); // should log the data, or false if an error occurred
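Thanks to everyone for taking the time to read this and help out. I am new to this, so if you think there is something I do not fully understand, any feedback or even reading material is welcome.
SOLVED
Brief explanation:
Here is what I was trying to accomplish:
- Intercept page requests so I can decide what not to load and speed up loading.
- Once an element shows up on the page, extract some data and return it.
I was trying to return it like this (note that all browser setup and error handling is left out, since it would only clutter the explanation):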
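var data = loadPage(url);

async function loadPage(URL)
{
    var data;
    // The wait is started but not awaited; its callback runs later.
    page.waitForSelector(/* selector */).then(async () => {
        var x = await page.evaluate(/* extraction function */); // page.evaluate returns data to x...
        data = x;
    });
    return data; // runs right away, while data is still undefined
}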
This does not work because the return runs immediately while waitForSelector completes later, so we always return undefined...
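The timing issue can be reproduced without a browser. The sketch below is only an illustration: `waitForContent` is a hypothetical stand-in for `page.waitForSelector` built on `setTimeout`, and the value 100 is borrowed from the simplified code in the question. It contrasts starting the wait without awaiting it against pausing on it with `await`.

```javascript
// Hypothetical stand-in for page.waitForSelector:
// resolves with the value 100 after `ms` milliseconds.
function waitForContent(ms) {
  return new Promise((resolve) => setTimeout(() => resolve(100), ms));
}

// Mirrors the failing attempt: the wait is started but never awaited.
async function loadWithoutAwait() {
  let data;
  waitForContent(50).then((value) => { data = value; });
  return data; // runs before the timer fires, so data is still undefined
}

// Same steps, but execution pauses at `await` until the promise settles.
async function loadWithAwait() {
  const data = await waitForContent(50);
  return data;
}

loadWithoutAwait().then((v) => console.log('without await:', v)); // without await: undefined
loadWithAwait().then((v) => console.log('with await:', v));       // with await: 100
```

Both functions return a promise, because every `async` function does; the difference is only what that promise resolves to.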
The correct way, or rather the way that works for me, is to return the whole promise and then extract the data...
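var data = loadPage(url);
data.then(/* do what needs to be done with the data */);

async function loadPage(URL)
{
    // Chain the extraction onto the wait, so the promise resolves to the extracted data.
    var data = page.waitForSelector(/* selector */).then(async () => {
        var x = await page.evaluate(/* extraction function */); // page.evaluate returns data to x...
        return x;
    });
    return data; // we return data as a promise
}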
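For comparison, the same idea can be written with `await` inside the function, which also gives a natural place to return false on a timeout, as the question asked for. This is a minimal sketch under assumptions, not the code from my project: `page` is taken to be an existing Puppeteer Page passed in by the caller, the `selector` and `timeoutMs` parameters are illustrative, and blocking images is just an example of a request one might choose not to load.

```javascript
// Sketch only: `page` is assumed to be a Puppeteer Page created elsewhere.
// `selector` and `timeoutMs` are illustrative parameters, not from the question.
async function loadPage(page, url, selector, timeoutMs = 10000) {
  let contentLoaded = false;

  // Every intercepted request must be either continued or aborted.
  await page.setRequestInterception(true);
  page.on('request', (req) => {
    // Assumption for illustration: skip images, and skip everything
    // once the wanted content has been extracted.
    if (contentLoaded || req.resourceType() === 'image') {
      req.abort();
    } else {
      req.continue();
    }
  });

  try {
    // Do not wait for the full load event, only for the DOM to be parsed.
    await page.goto(url, { waitUntil: 'domcontentloaded' });
    // Execution pauses here until the element appears or the timeout elapses.
    await page.waitForSelector(selector, { timeout: timeoutMs });
    // Extract the element's text inside the page context.
    const content = await page.$eval(selector, (el) => el.textContent);
    contentLoaded = true;
    return content;
  } catch (err) {
    // waitForSelector rejects on timeout; any other failure also ends up here.
    return false;
  }
}
```

A caller could then write something like `loadPage(page, url, '#result').then(console.log)`, where `'#result'` is a made-up selector. Note that the catch block treats every error as "no data"; logging `err` before returning false would make failures easier to diagnose.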
I hope this is a solid enough explanation. If anyone needs to see the whole thing, I can edit the question and put the full code there...