Trying to web scrape with Python and pyppeteer

Question

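The goal is to pull information from a website that tracks TikTok followers and post it to the console or send it to a Discord channel. I currently use a Discord command to trigger it, but it only prints to the console. The code listed below prints: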

[<pyppeteer.element_handle.ElementHandle object at 0x00000214B2703640>]

@bot.command()
async def stats(ctx):
    statspage = await browser.newPage()
    await statspage.goto('https://livecounts.io/tiktok-live-follower-counter/charlieputh')
    t = await statspage.xpath('//*[@id="__next"]/div/div/div[3]/div[2]/div/div/div/div')
    print(t)

I would like it to return the number of followers listed on that page. Please help.

Answer 1

The page.xpath function gives you a list of elements, not text. If you want the text of an element, you need to evaluate it, for example:

elements = await statspage.xpath('//*[@id="__next"]/div/div/div[3]/div[2]/div/div/div/div')
text = await statspage.evaluate("e => e.innerText", elements[0])

As you probably know, pyppeteer is an unofficial Python port of puppeteer, so you should look at the puppeteer documentation to see how it works, and also at the pyppeteer docs to learn where the Python version differs.
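
Putting the two lines above into a small helper may make the command easier to read. This is only a sketch: first_text is a hypothetical name, not part of pyppeteer, and it assumes the page object exposes the xpath and evaluate coroutines used above. It also returns None when nothing matches, which can happen if the counter has not rendered yet.

```python
import asyncio


async def first_text(page, xpath):
    """Return the stripped innerText of the first XPath match, or None.

    `page` is assumed to behave like a pyppeteer Page: it must provide
    the `xpath` and `evaluate` coroutines. `first_text` itself is a
    hypothetical helper, not a pyppeteer function.
    """
    elements = await page.xpath(xpath)  # list of element handles, not text
    if not elements:
        return None  # nothing matched the expression
    # Run the JS function in the page, passing the first handle as `e`.
    text = await page.evaluate("e => e.innerText", elements[0])
    return text.strip()
```

Inside the stats command you could then call count = await first_text(statspage, '<the same XPath as above>') and, if count is not None, post it with await ctx.send(count) instead of print.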

python

discord.py

pyppeteer