Print page source in Python Playwright

I have a PHP script that calls a Python script, passing a URL as a command-line argument:

import json
import sys
import urllib.parse
link = urllib.parse.unquote(sys.argv[1])
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.launch()
    context = browser.new_context(user_agent='Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36')
    page = context.new_page()
    cookie_file = open('./cookies.json')
    cookies = json.load(cookie_file)
    print(cookies)
    context.add_cookies(cookies)
    page.goto(link)
    try:
        page.wait_for_timeout(10000)
        print(page.innerHTML("*"))
        page.close()
        context.close()
        browser.close()      
    except Exception as e:
        print("Error in playwright script.")
        page.close()
        context.close()
        browser.close()     

However, when I try to print the page source after visiting the page, I get:

Error in playwright script.

because this line that I tried does not work:

print(page.innerHTML("*"))

Any help?

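To get the full HTML content of the page, use page.content(). The Python API uses snake_case method names, so page.innerHTML does not exist there (the Python counterpart is page.inner_html(selector), which returns the inner HTML of the first element matching the selector). Calling page.innerHTML("*") therefore raises an AttributeError, and the broad except block hides it behind the generic "Error in playwright script." message. Printing the caught exception makes the real cause visible.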
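As a rough illustration, the script from the question could be reworked along these lines with page.content(). This is only a sketch under a few assumptions: the helper names load_cookies and fetch_page_source are made up for this example, cookies.json is assumed to hold a list of cookie dicts in the format add_cookies accepts, and the 10-second wait is kept from the original rather than being a recommended value.

```python
import json
import sys
import urllib.parse

# User agent string copied from the question.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"
)


def load_cookies(path):
    """Read cookies from a JSON file (hypothetical helper for this sketch)."""
    with open(path, encoding="utf-8") as cookie_file:
        return json.load(cookie_file)


def fetch_page_source(link, cookies=None, wait_ms=10000):
    """Open link in headless Chromium and return the full page HTML."""
    # Imported here so load_cookies can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            context = browser.new_context(user_agent=USER_AGENT)
            if cookies:
                context.add_cookies(cookies)
            page = context.new_page()
            page.goto(link)
            # Same fixed wait as in the question.
            page.wait_for_timeout(wait_ms)
            # content() returns the full HTML of the page, doctype included.
            return page.content()
        finally:
            # Closing the browser also closes its contexts and pages.
            browser.close()


if __name__ == "__main__" and len(sys.argv) > 1:
    link = urllib.parse.unquote(sys.argv[1])
    try:
        print(fetch_page_source(link, load_cookies("./cookies.json")))
    except Exception as e:
        # Report the real error instead of a generic message.
        print(f"Error in playwright script: {e!r}", file=sys.stderr)
        sys.exit(1)
```

Writing the error to stderr keeps stdout clean, so the PHP side receives only the HTML when the call succeeds. If only part of the page is needed, page.inner_html("body") could be used in place of page.content().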