How to get a Puppeteer / Node script to read a <div> by its class?

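Is there a way to select a <div> by its class in a Node Puppeteer web scraper?

The page contains a div like this: <div class="Body-body-qL80Q">

I want my scraper to grab the text from this div. For now, I am just trying to write it to the console to check that it is grabbing the right text.

What is wrong with my querySelector? (Earlier I had the script navigate to the correct page and take a screenshot, and it did that correctly, so I know the rest of it works.)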

const puppeteer = require('puppeteer');
const CREDS = require('./creds');

(async () => {
  const browser = await puppeteer.launch({ headless: true });

  const page = await browser.newPage();

  await page.goto('https://www.squarespace.com/login');

  const USERNAME_SELECTOR = '.username.Input-hxTtdt.ipapEE';
  const PASSWORD_SELECTOR = '.password.Input-hxTtdt.ipapEE';
  const BUTTON_SELECTOR = '.Button-kDSBcD.fATVqu';

  await page.click(USERNAME_SELECTOR);
  await page.keyboard.type(CREDS.username);

  await page.click(PASSWORD_SELECTOR);
  await page.keyboard.type(CREDS.password);

  await Promise.all([
    page.waitForNavigation(),
    page.click(BUTTON_SELECTOR),
  ]);

  await page.goto('https://triangle-oarfish-hk88.squarespace.com/config/analytics#activity-log');

  const textContent = await page.evaluate(() => document.querySelector('Body-body-qL80Q').className);

  console.log(textContent);

  await browser.close();
})();

Here is the error:

(node:6116) UnhandledPromiseRejectionWarning: Error: Evaluation failed: TypeError: Cannot read property 'className' of null
(node:6116) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:6116) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.

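You forgot to put a period (.) in front of the class selector Body-body-qL80Q in the document.querySelector() call inside page.evaluate(). Without the period, 'Body-body-qL80Q' is treated as a tag name, nothing matches, and querySelector() returns null, which is why reading className throws.

Also, you should use the textContent property instead of className. className returns the element's class attribute, not its text.

Your textContent constant should be initialized as follows: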

const textContent = await page.evaluate(() => document.querySelector('.Body-body-qL80Q').textContent);
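If the div is rendered by client-side script after navigation, the corrected selector could still return null when it is evaluated too early. The question does not say whether that is the case here, so treat the following as a minimal sketch under that assumption: wait for the element, then read its text. readTextBySelector is a hypothetical helper name and the 10-second timeout is an arbitrary choice; page.waitForSelector() and page.$eval() are standard Puppeteer Page methods.

```javascript
// Hypothetical helper (not part of Puppeteer): waits for an element to exist,
// then returns its trimmed text content.
async function readTextBySelector(page, selector, timeout = 10000) {
  // Rejects with a timeout error if no matching element appears in time,
  // instead of failing later with "Cannot read property ... of null".
  await page.waitForSelector(selector, { timeout });

  // $eval runs the callback in the page context against the first match.
  return page.$eval(selector, (el) => el.textContent.trim());
}

// Usage inside the existing async function, after page.goto(...):
// const textContent = await readTextBySelector(page, '.Body-body-qL80Q');
// console.log(textContent);
```

Note that the selector passed in still needs the leading period, since it is an ordinary CSS class selector.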