How can I pass a variable into an evaluate function?

I'm trying to pass a variable into a page.evaluate() function in Puppeteer, but when I use the following very simplified example, the variable evalVar is undefined.

I'm new to Puppeteer and can't find any examples to build on, so I need help passing that variable into the page.evaluate() function so I can use it inside.

const puppeteer = require('puppeteer');

(async() => {

  const browser = await puppeteer.launch({headless: false});
  const page = await browser.newPage();

  const evalVar = 'WHUT??';

  try {

    await page.goto('https://www.google.com.au');
    await page.waitForSelector('#fbar');
    const links = await page.evaluate((evalVar) => {

      console.log('evalVar:', evalVar); // appears undefined

      const urls = [];
      const hrefs = document.querySelectorAll('#fbar #fsl a');
      hrefs.forEach(function(el) {
        urls.push(el.href);
      });
      return urls;
    })
    console.log('links:', links);

  } catch (err) {

    console.log('ERR:', err.message);

  } finally {

    // browser.close();

  }

})();

You have to pass the variable as an argument to the pageFunction, like this:

const links = await page.evaluate((evalVar) => {

  console.log(evalVar); // 2. should be defined now
  …

}, evalVar); // 1. pass variable as an argument
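
A rough way to see why the original attempt logs undefined: page.evaluate() does not share the Node scope with the page, so only values listed after the function arrive as its parameters. The sketch below imitates that in plain Node with a hypothetical simulateEvaluate helper. It is not part of Puppeteer and is not how Puppeteer is implemented; it only rebuilds the function from its source text and JSON-copies the arguments.

```javascript
// Hypothetical stand-in for page.evaluate(): the function is rebuilt from its
// source text and the arguments are JSON-copied, so nothing from the caller's
// scope comes along for the ride.
function simulateEvaluate(pageFunction, ...args) {
  const source = pageFunction.toString();
  const copiedArgs = JSON.parse(JSON.stringify(args));
  const rebuilt = new Function(`return (${source})`)();
  return rebuilt(...copiedArgs);
}

const evalVar = 'WHUT??';

// Parameter declared, but no value passed after the function.
const withoutArg = simulateEvaluate((evalVar) => typeof evalVar);

// Value passed after the function: the parameter receives it.
const withArg = simulateEvaluate((evalVar) => evalVar, evalVar);

console.log(withoutArg); // undefined
console.log(withArg);    // WHUT??
```

The parameter name inside the arrow function is independent of the outer variable; it only gets a value when one is supplied as a trailing argument.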

You can pass in multiple variables by passing more arguments to page.evaluate():

await page.evaluate((a, b, c) => { console.log(a, b, c) }, a, b, c)

The arguments must be either serializable as JSON or JSHandles of in-browser objects: https://pptr.dev/#?show=api-pageevaluatepagefunction-args
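
To get a feel for what "serializable as JSON" means in practice, here is a small sketch that uses a JSON round trip as an approximation of the trip from Node to the page. Puppeteer's real transport may differ in details, and the args object and its field names are made up for illustration.

```javascript
// Approximation only: a JSON round trip stands in for the Node -> browser hop.
const roundTrip = (value) => JSON.parse(JSON.stringify(value));

const args = {
  text: 'WHUT??',          // string: survives
  count: 3,                // number: survives
  nested: { ok: true },    // plain object: survives
  pattern: /abc/g,         // RegExp: becomes an empty object
  callback: () => 'hi',    // function: dropped entirely
};

const received = roundTrip(args);

console.log(received.text, received.count, received.nested.ok);
console.log(Object.keys(received.pattern).length); // 0
console.log('callback' in received);               // false
```

Strings, numbers, booleans, arrays and plain objects come through intact; functions and RegExp instances need to be converted to strings first.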

I encourage you to stick with this style, because it is more convenient and readable:

let name = 'jack';
let age  = 33;
let location = 'Berlin/Germany';

await page.evaluate(({name, age, location}) => {

    console.log(name);
    console.log(age);
    console.log(location);

},{name, age, location});

Single variable:

You can pass one variable to page.evaluate() using the following syntax:

await page.evaluate(example => { /* ... */ }, example);

Note: You do not need to enclose the variable in (), unless you are going to be passing multiple variables.

Multiple variables:

You can pass multiple variables to page.evaluate() using the following syntax:

await page.evaluate((example_1, example_2) => { /* ... */ }, example_1, example_2);

Note: Enclosing your variables within {} is not necessary.

There are two ways to pass a function.

// 1. Defined in evaluationContext
await page.evaluate(() => {
  window.yourFunc = function() {...};
});
const links = await page.evaluate(() => {
  const func = window.yourFunc;
  func();
});


// 2. Convert the function to a string, because functions cannot be serialized
const yourFunc = function() {...};
const obj = {
  func: yourFunc.toString()
};
const otherObj = {
  foo: 'bar'
};
const links = await page.evaluate((obj, aObj) => {
   const funStr = obj.func;
   const func = new Function(`return ${funStr}.apply(null, arguments)`)
   func();

   const foo = aObj.foo; // bar, for object
   window.foo = foo;
   debugger;
}, obj, otherObj);
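
The second approach can be tried out in plain Node, without a browser. This is only a sketch: the JSON round trip stands in for the hop to the page, and double, payload and revived are illustrative names. Note that a function revived from its source text has no access to variables from the scope where it was originally defined.

```javascript
// A self-contained function: it uses only its own parameters.
const double = function (n) { return n * 2; };

// Send the source text plus ordinary data, as page.evaluate() arguments would be.
const payload = JSON.parse(JSON.stringify({
  func: double.toString(),
  input: 21,
}));

// On the receiving side, turn the source text back into a callable function.
const revived = new Function(`return (${payload.func})`)();

console.log(typeof payload.func);    // string
console.log(revived(payload.input)); // 42
```

Wrapping the source in parentheses lets the same line revive both function expressions and arrow functions.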

You can add devtools: true to the launch options for testing.

It took me a long time to figure out that console.log() inside evaluate() does not show up in the Node console.

Reference: https://github.com/GoogleChrome/puppeteer/issues/1944

Everything that is run inside the page.evaluate function is done in the context of the browser page. The script runs in the browser, not in Node.js, so anything you log shows up in the browser's console, which you will not see if you are running headless. You also can't set a Node breakpoint inside the function.
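
If you want that output in the terminal anyway, Puppeteer emits a 'console' event on the page for each message logged there, and the message object exposes a text() method. A minimal sketch, assuming a hypothetical forwardConsole helper and a "[page]" prefix of my own choosing:

```javascript
// Hypothetical helper: relay messages logged inside the page to a Node-side
// logger. `page` is expected to be a Puppeteer Page (anything with an
// on('console', handler) method works for this sketch).
function forwardConsole(page, log = console.log) {
  page.on('console', (msg) => log(`[page] ${msg.text()}`));
}
```

Call forwardConsole(page) once after browser.newPage() and before page.evaluate(), and console.log() calls made inside the evaluated function should then appear in the Node console as well.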

Hope this helps.

I have a TypeScript example that may help someone new to TypeScript.

const hyperlinks: string [] = await page.evaluate((url: string, regex: RegExp, querySelect: string) => {
.........
}, url, regex, querySelect);
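
One caveat with the regex argument: a RegExp is not JSON-serializable (JSON.stringify turns it into {}), so it may not arrive in the page as a usable pattern. A possible workaround, sketched here in plain JavaScript with a JSON round trip standing in for the hop to the browser, is to send the source and flags as strings and rebuild the RegExp on the other side. The variable names are illustrative.

```javascript
const regex = /^https:\/\//i;

// What a JSON-based transport makes of the RegExp itself.
console.log(JSON.stringify(regex)); // {}

// Send the parts as strings instead...
const wire = JSON.parse(JSON.stringify({
  source: regex.source,
  flags: regex.flags,
}));

// ...and rebuild the pattern on the receiving side.
const rebuilt = new RegExp(wire.source, wire.flags);

console.log(rebuilt.test('HTTPS://example.com')); // true
```

Inside page.evaluate() the rebuild step would go at the top of the page function, with source and flags passed as ordinary string arguments.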

A slightly different version of @wolf's answer above. It makes the code more reusable across different contexts.

// util functions
export const pipe = (...fns) => initialVal => fns.reduce((acc, fn) => fn(acc), initialVal)
export const pluck = key => obj => obj[key] || null
export const map = fn => item => fn(item)
// these variables will be cast to string, look below at fn.toString()
const updatedAt = await page.evaluate(
  ([selector, util]) => {
    let { pipe, map, pluck } = util
    pipe = new Function(`return ${pipe}`)()
    map = new Function(`return ${map}`)()
    pluck = new Function(`return ${pluck}`)()

    return pipe(
      s => document.querySelector(s),
      pluck('textContent'),
      map(text => text.trim()),
      map(date => Date.parse(date)),
      map(timeStamp => Promise.resolve(timeStamp))
    )(selector)
  },
  [
    '#table-announcements tbody td:nth-child(2) .d-none',
    { pipe: pipe.toString(), map: map.toString(), pluck: pluck.toString() },
  ]
)

Also note that functions inside the pipe cannot be passed like this:

// incorrect: passed unbound, querySelector loses its `document` receiver and throws
pipe(document.querySelector) 

// should be 
pipe(s => document.querySelector(s))
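
The likely reason is that document.querySelector reads its receiver from `this`, and passing the bare method reference drops it. The same effect can be reproduced in plain Node with any method that uses `this`; store and lookup below are made-up names for illustration, and pipe is the helper from the answer above.

```javascript
const pipe = (...fns) => (initialVal) => fns.reduce((acc, fn) => fn(acc), initialVal);

// A method that depends on `this`, standing in for document.querySelector.
const store = {
  items: { a: 1 },
  lookup(key) { return this.items[key]; },
};

// Passing the bare method: `this` is no longer `store` when pipe calls it.
let unboundError = null;
try {
  pipe(store.lookup)('a');
} catch (err) {
  unboundError = err;
}

// Wrapping in an arrow function, or binding, keeps the receiver.
const viaArrow = pipe((key) => store.lookup(key))('a');
const viaBind = pipe(store.lookup.bind(store))('a');

console.log(unboundError instanceof TypeError, viaArrow, viaBind); // true 1 1
```

By the same logic, pipe(document.querySelector.bind(document)) should also work in the page, though the arrow-function form is usually easier to read.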