How to remove a child from a puppeteer/playwright JSelement node and then fetch innerText
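I can get hold of the cell using playwright/puppeteer. I want to capture the following two values separately: the date and the status.
I have the following code: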
let allCells = await allRows[0].$$('[role="cell"]');
let ele = await allCells[0].$('.description');
let status = await (await ele.getProperty("innerText")).jsonValue();
// I can get the status as 'uploaded' just fine using this
allCells[0].removeChild(ele); // this throws an error
let uploadDate = await (await allCells[0].getProperty("innerText")).jsonValue();
The error it throws is:
TypeError: allCells[0].removeChild is not a function
console.log( allCells[0] ) returns:
JSHandle@....
Here is the relevant part of the HTML:
<html>
<body>
<div role="cell" class="cell-body">
<!---->Jul 11, 2021
<div class="description">
uploaded
</div>
</div>
</body>
</html>
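Unfortunately, you cannot call web API methods (.removeChild) on JS handles or element handles in the puppeteer (Node.js) context. A handle is only a reference to an object that lives in the page, so DOM methods have to run inside the browser context.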
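If you really do want to remove the child first, one option is to do the removal inside the page by passing both handles to evaluate. This is only a sketch, assuming the `allCells[0]` and `ele` handles from the question; `removeChildAndReadText` is a hypothetical helper name, and note that it changes the live page.

```javascript
// Runs inside the page when passed to elementHandle.evaluate().
// `cell` is the element behind the handle evaluate() is called on,
// `child` is the element behind the handle passed as the extra argument.
function removeChildAndReadText(cell, child) {
  cell.removeChild(child); // a real DOM call, valid only in the browser context
  return cell.innerText.trim();
}

// Hypothetical usage with the handles from the question:
// const status = await (await ele.getProperty('innerText')).jsonValue();
// const uploadDate = await allCells[0].evaluate(removeChildAndReadText, ele);
```

Read the status before calling this, because the description element is detached from the cell afterwards.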
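You can try to get all the data in the browser context with something like this (.childNodes[0] gives you just the first text node, the one that ends where the <div class="description"> element begins):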
import puppeteer from 'puppeteer';
const browser = await puppeteer.launch();
const html = `
<html>
<body>
<div role="cell" class="cell-body">
Jul 11, 2021
<div class="description">
uploaded
</div>
</div>
</body>
</html>`;
try {
const [page] = await browser.pages();
await page.goto(`data:text/html,${html}`);
const data = await page.evaluate(() => {
const date = document.querySelector('div.cell-body').childNodes[0].textContent.trim();
const description = document.querySelector('div.description').innerText;
return [date, description];
});
console.log(data);
} catch (err) { console.error(err); } finally { await browser.close(); }
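One caveat with .childNodes[0]: if the real markup has whitespace or a comment node such as <!----> before the date, as in the question's HTML, the first child node may not be the date text. A variation that does not depend on the position is to collect only the text nodes that are direct children of the cell. This is a minimal sketch; `directText` is a hypothetical helper name, and the usage lines assume the `allCells[0]` handle from the question.

```javascript
// Runs inside the page when passed to elementHandle.evaluate(); it is
// serialized, so it must not reference variables from the Node.js scope.
function directText(el) {
  const TEXT_NODE = 3; // same value as Node.TEXT_NODE in the browser
  return Array.from(el.childNodes)
    .filter((node) => node.nodeType === TEXT_NODE) // skip comments and child elements
    .map((node) => node.textContent)
    .join('')
    .trim();
}

// Hypothetical usage with the handle from the question:
// const uploadDate = await allCells[0].evaluate(directText);
// const status = await allCells[0].$eval('.description', (d) => d.innerText.trim());
```

Nothing is removed from the page here, so the status can be read before or after the date.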