Puppeteer, awaiting a selector, and returning data from within

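I'm loading a page and intercepting its requests. When a certain element shows up, I stop loading and extract the data I need...

Here is the problem I'm facing.

Simplified, the code basically looks like this: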

async function loadPage()
    {
        var contentLoaded = false;
        var content;
        
        // When the element shows up, do something.
        // In the real code this is page.waitForSelector, but for simplicity
        // I use a timeout here, because the problem is the same.
        // I set it to "show up" in 10 seconds.
        // When it shows up, it sets content to 100 (standing in for the
        // data I want to extract) and stores it.
        
        setTimeout(()=>{
            content = 100;
            contentLoaded = true;
        },10000)



        // Here I have a function that loads the page,
        // intercepts requests and handles them
        // until the content is loaded.
        
        page.on('request', req =>{
             if(!contentLoaded)
             {
                 // keep loading page
             }
          })  
       

        // This is the piece of code I would like NOT to run
        // UNTIL I either get the data or a timeout error
        // from page.waitForSelector...

        // But JavaScript will run it if it's not busy with the
        // loading function above...

        // In 10 seconds the content shows up
        // and it's stored in `content`, but this piece of code has
        // already finished by the time that is done...
        // so it returns false...
        
        if(contentLoaded)
            {return content}
        else
            {return false}

    }

var x = loadPage();
x.then(console.log); // should log the data, or false if an error occurred

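Thanks to everyone for taking the time to read this and help. I'm a novice, so if you think there's something I don't fully understand, any feedback or even reading material is welcome.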

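Solved

Quick explanation:

Here is what I was trying to accomplish:

  1. Intercept the page's requests, so I can decide what not to load and speed up loading.
  2. As soon as the element shows up on the page, extract some data and return it.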
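
Step 1 can be sketched roughly as follows. This is only a minimal sketch, assuming `page` is a Puppeteer `Page` created elsewhere; the helper name `blockHeavyRequests` and the list of blocked resource types are made up for illustration and would need adjusting for the page being loaded.

```javascript
// Resource types to skip. This list is an assumption, not a recommendation.
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font', 'media']);

// `page` is a Puppeteer Page created elsewhere (e.g. via browser.newPage()).
async function blockHeavyRequests(page, blockedTypes = BLOCKED_TYPES) {
  // Interception must be enabled before requests can be aborted or continued.
  await page.setRequestInterception(true);

  page.on('request', request => {
    if (blockedTypes.has(request.resourceType())) {
      request.abort();    // do not load this resource
    } else {
      request.continue(); // let everything else through
    }
  });
}
```

Once interception is enabled, every request has to be either aborted or continued, otherwise the page stalls waiting on it.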

I was trying to return it like this (note that all browser setup and error handling is left out, because it would only clutter the explanation):

var data = loadPage(url);

async function loadPage(URL)
    {
     var data;

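     // Not awaited: the callback only runs later, once the element appears.
     // (`selector` stands in for the element being waited on.)
     page.waitForSelector(selector).then(async () => {
         var x = await page.evaluate(/* ...returns data to x... */);
         data = x;
     });
    return data; // runs right away, while data is still undefined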
    }

This doesn't work because the return runs immediately while the waitForSelector callback runs later, so we always return undefined...
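
The timing can be reproduced without a browser. The snippet below is just an illustration: `fakeWaitForSelector` is a made-up stand-in for `page.waitForSelector`, built on a timer, and `loadPageBroken` is a hypothetical name for the function above.

```javascript
// Made-up stand-in for page.waitForSelector: resolves after `ms` milliseconds.
function fakeWaitForSelector(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

async function loadPageBroken() {
  let data;

  // Not awaited: this only schedules the callback for later.
  fakeWaitForSelector(50).then(() => {
    data = 100; // happens about 50 ms after the function has returned
  });

  return data; // reached immediately, so data is still undefined
}

loadPageBroken().then(value => console.log(value)); // logs undefined
```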

The correct way, or rather the way that worked for me, is to return the whole promise and then extract the data from it...

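var data = loadPage(url);
data.then(x => { /* do what needs to be done with the data */ });

async function loadPage(URL)
    {
    // (`selector` stands in for the element being waited on.)
    var data = page.waitForSelector(selector).then(() => {
         // page.evaluate returns the data, which becomes the promise's value
         return page.evaluate(/* ...extracts the data... */);
     });
    return data; // we return data as a promise
    }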
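
For anyone who wants the "data or false on timeout" behaviour from the original question, the same idea can also be written with `await`. This is a sketch under assumptions, not the exact code I ended up with: `page`, `url`, `selector` and `timeoutMs` are parameters supplied by the caller, and reading `textContent` is just one example of what to extract.

```javascript
// `page` is a Puppeteer Page; `url` and `selector` come from the caller.
async function loadPage(page, url, selector, timeoutMs = 10000) {
  try {
    // Resolve navigation once the DOM is parsed instead of on full load.
    await page.goto(url, { waitUntil: 'domcontentloaded' });

    // Pauses this function (not the whole program) until the element exists;
    // rejects if it does not appear within timeoutMs.
    await page.waitForSelector(selector, { timeout: timeoutMs });

    // Runs in the page context and returns the element's text.
    return await page.$eval(selector, el => el.textContent);
  } catch (err) {
    // Navigation failure or timeout: report "no data".
    return false;
  }
}

// Usage (not run here, it needs a real browser; URL and selector are examples):
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch();
// const page = await browser.newPage();
// loadPage(page, 'https://example.com', '#price').then(console.log);
```

The caller still gets a promise back, so `loadPage(...).then(...)` or `await loadPage(...)` is needed either way.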

I hope this explanation is solid enough. If anyone needs to see the whole thing, I can edit the question and put the full code there...