How to get a Puppeteer / Node script to read a <div> by its class?
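Is there a way to select a <div> by its class in a Node Puppeteer web scraper?
There is a div on the page that looks like this: <div class="Body-body-qL80Q">
I'd like my scraper to grab the text from this div. For now, I'm just trying to write it to the console to check that it is scraping the right text.
What's wrong with my querySelector? (Earlier I had the script navigate to the correct page and take a screenshot, and it did that correctly, so I know the rest of it works.)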
const puppeteer = require('puppeteer');
const CREDS = require('./creds');
(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto('https://www.squarespace.com/login');
  const USERNAME_SELECTOR = '.username.Input-hxTtdt.ipapEE';
  const PASSWORD_SELECTOR = '.password.Input-hxTtdt.ipapEE';
  const BUTTON_SELECTOR = '.Button-kDSBcD.fATVqu';
  await page.click(USERNAME_SELECTOR);
  await page.keyboard.type(CREDS.username);
  await page.click(PASSWORD_SELECTOR);
  await page.keyboard.type(CREDS.password);
  await Promise.all([
    page.waitForNavigation(),
    page.click(BUTTON_SELECTOR),
  ]);
  await page.goto('https://triangle-oarfish-hk88.squarespace.com/config/analytics#activity-log');
  const textContent = await page.evaluate(() => document.querySelector('Body-body-qL80Q').className);
  console.log(textContent);
  await browser.close();
})();
Here is the error:
(node:6116) UnhandledPromiseRejectionWarning: Error: Evaluation failed: TypeError: Cannot read property 'className' of null
(node:6116) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:6116) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
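You forgot to add a period (.) before the class selector Body-body-qL80Q in the document.querySelector() call inside page.evaluate(). Without the period, the string is treated as a tag name, nothing matches, and querySelector() returns null.
In addition, you should use the textContent property instead of the className property.
Your textContent constant should be initialized as follows: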
const textContent = await page.evaluate(() => document.querySelector('.Body-body-qL80Q').textContent);
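If the div is rendered by client-side script after navigation (an assumption; the question does not say either way), the corrected selector can still return null when it runs too early. A minimal sketch of one way to guard against that, using Puppeteer's page.waitForSelector() and page.$eval(), follows. The helper names classSelector and readTextByClass, and the 10-second default timeout, are hypothetical and not part of the original script.

```javascript
// Hypothetical helper: turn a bare class name into a CSS class selector.
// A leading period makes the selector match a class instead of a tag name.
function classSelector(className) {
  return className.startsWith('.') ? className : `.${className}`;
}

// Hypothetical helper: wait for the element, then return its trimmed text.
// `page` is expected to be a Puppeteer Page; timeoutMs is an assumed default.
async function readTextByClass(page, className, timeoutMs = 10000) {
  const selector = classSelector(className);
  // Rejects with a timeout error if no matching element appears in time,
  // instead of failing later on a property read of null.
  await page.waitForSelector(selector, { timeout: timeoutMs });
  // $eval runs the callback in the page against the first matching element.
  const text = await page.$eval(selector, (el) => el.textContent);
  return text.trim();
}
```

Inside the async function from the question, this could be called as console.log(await readTextByClass(page, 'Body-body-qL80Q')); after the final page.goto().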