How do you open multiple pages asynchronously with Playwright Python?
I want to open multiple URLs at the same time using Playwright for Python, but I'm struggling to figure out how. This is from the async docs:
    import asyncio

    from playwright.async_api import async_playwright


    async def main():
        async with async_playwright() as p:
            # Launch each browser engine in turn and screenshot the same page.
            for browser_type in [p.chromium, p.firefox, p.webkit]:
                browser = await browser_type.launch()
                page = await browser.new_page()
                await page.goto("https://scrapingant.com/")
                await page.screenshot(path=f"scrapingant-{browser_type.name}.png")
                await browser.close()


    asyncio.get_event_loop().run_until_complete(main())
This opens each browser_type one after another. How would I go about doing it in parallel? And how would I do something similar for a list of URLs?
I tried this:
    import asyncio

    from playwright.async_api import async_playwright

    urls = [
        "https://scrapethissite.com/pages/ajax-javascript/#2015",
        "https://scrapethissite.com/pages/ajax-javascript/#2014",
    ]


    async def main(url):
        # One browser instance per URL.
        async with async_playwright() as p:
            browser = await p.chromium.launch(headless=False)
            page = await browser.new_page()
            await page.goto(url)
            await browser.close()


    async def go_to_url():
        tasks = [main(url) for url in urls]
        await asyncio.wait(tasks)


    go_to_url()  # called directly, not awaited or handed to an event loop
But this gives me the following error:
    92: RuntimeWarning: coroutine 'go_to_url' was never awaited
      go_to_url()
    RuntimeWarning: Enable tracemalloc to get the object allocation traceback
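Calling go_to_url() on its own only creates a coroutine object; nothing runs until an event loop drives it, which is what the warning is telling you. I think you need to call the go_to_url function the same way main() is run in the docs example: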
    asyncio.get_event_loop().run_until_complete(go_to_url())
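For the URL-list half of the question, here is a minimal sketch of one way to do it rather than the only way. It is built on a few assumptions of mine: it shares a single Chromium instance and opens one page per URL instead of launching a browser per URL, it returns each page's title only to have something to show, and the helper names fetch_title and open_all are made up for this example. It uses asyncio.gather instead of asyncio.wait, because asyncio.wait stopped accepting bare coroutine objects in Python 3.11.

```python
import asyncio

URLS = [
    "https://scrapethissite.com/pages/ajax-javascript/#2015",
    "https://scrapethissite.com/pages/ajax-javascript/#2014",
]


async def fetch_title(browser, url):
    """Hypothetical helper: open one URL in its own page and return its title."""
    page = await browser.new_page()
    try:
        await page.goto(url)
        return await page.title()
    finally:
        # Close the page even if navigation fails.
        await page.close()


async def open_all(browser, urls):
    """Hypothetical helper: load all URLs concurrently, results in input order."""
    return await asyncio.gather(*(fetch_title(browser, url) for url in urls))


async def main():
    # Imported here so the helpers above do not need Playwright at import time.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        try:
            titles = await open_all(browser, URLS)
        finally:
            await browser.close()
    for url, title in zip(URLS, titles):
        print(url, title)


# Start the event loop with: asyncio.run(main())
```

On Python 3.7 or later, asyncio.run(main()) creates the event loop, runs main() to completion, and closes the loop, so it can stand in for the get_event_loop().run_until_complete(...) call. The same gather pattern should also cover the first half of the question: write a coroutine that takes a browser_type, launches it, and takes the screenshot, then gather one call per browser type.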