How to open a new tab using Python Playwright by feeding it a list of URLs?

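According to the Playwright documentation, the way to open a new tab in the browser should be as shown in the scrap_post_info() function below. However, it does not do that.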

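What I am currently trying to do is loop through each URL in the posts list variable and open that URL in a new tab to scrape the post details. Once the post has been scraped, the tab is closed and the next URL is opened in a new tab to scrape its post details, and so on until the last URL in the posts list variable is reached.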

# Loop through each URL from the `posts` list variable that contains many posts' URLs
for post in posts:
    scrap_post_info(context, post)

def scrap_post_info(context, post):

    with context.expect_page() as new_page_info:
        page.click('a[target="_blank"]')  # Opens a new tab
    new_page = new_page_info.value

    new_page.wait_for_load_state()
    print(new_page.title())

I am doing something similar for my project, and this is what I would do.

from playwright.sync_api import sync_playwright

posts = ['https://playwright.dev/','https://playwright.dev/python/',]

def scrap_post_info(context, post):
    page = context.new_page()
    page.goto(post)
    print(page.title())
    # do whatever scraping you need to
    page.close()

with sync_playwright() as p:
    browser = p.chromium.launch()
    context = browser.new_context()
    for post in posts:
        scrap_post_info(context, post)
        # optional: pause here between requests
    # Close the browser while the Playwright session is still active
    browser.close()

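The code snippet in the Playwright documentation is more about opening a new page after clicking a link on an existing page. Since you already have the URLs, you can simply visit them one by one, each in its own page, and do the scraping there.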
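
If it helps, here is a slightly more defensive sketch of the same idea. It is only one possible arrangement, not the only way to do it: the names scrape_titles, main, and delay_seconds are made up for illustration, and collecting page titles stands in for whatever scraping you actually need. Each tab is closed in a finally block so that a failed navigation does not leave tabs open, and Playwright is imported inside main so the loop itself can be tried out with any object that offers new_page().

```python
import time


def scrape_titles(context, urls, delay_seconds=0.0):
    """Open each URL in its own tab, record the page title, then close the tab."""
    titles = {}
    for index, url in enumerate(urls):
        # Optional pause between requests (skipped before the first URL).
        if index > 0 and delay_seconds > 0:
            time.sleep(delay_seconds)
        page = context.new_page()  # a new tab in the existing browser context
        try:
            page.goto(url)
            titles[url] = page.title()  # placeholder for the real scraping
        finally:
            page.close()  # close the tab even if goto() or the scraping fails
    return titles


def main(urls):
    # Imported here so scrape_titles can be exercised without a browser installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        context = browser.new_context()
        try:
            return scrape_titles(context, urls, delay_seconds=1.0)
        finally:
            browser.close()  # must happen while the Playwright session is active
```

With Playwright and its Chromium build installed, calling main(posts) with the posts list from the question should return a dict that maps each URL to its page title.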