Selenium will not load the full page source; it gets partway through the CSS styles and then cuts off

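I have looked at several answers on Stack Overflow, but none of them helped. When I print the page source of the web page, I only see the source up to a certain point inside a tag, give or take a few characters. The HTML elements beyond that point are never loaded or printed in the page source. When I try to locate HTML elements that should be there (they are there when I view the page source in Chrome), I get a TimeoutException or a NoSuchElementException.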

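I am parsing a dynamically loaded website after passing through a multi-factor authentication (MFA) portal. I printed driver.current_url to make sure I was at the correct URL after MFA, tried sleep(100), and tried explicit waits with EC.url_contains(...), EC.element_to_be_clickable(...), and EC.presence_of_element_located(...).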

Here is my code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://brightspace.nyu.edu/d2l/home"

driver = webdriver.Chrome()     # should open a Chrome window
driver.get(url)         # navigate to brightspace

# MFA Handling Code here #

# Explicitly wait until we reach the Brightspace home page (logged in)   
element = WebDriverWait(driver,100).until(EC.url_contains('https://brightspace.nyu.edu/d2l/home'))
print(driver.page_source)
banner = driver.find_element_by_id('bannerTitle')   # throws NoSuchElementException

Here is part of the output:

        <!-- ... previous styles and HTML in <head> ... -->
        <style is="custom-style">html {
                        --d2l-color-woolonardo: var(--d2l-color-sylvite);
                        .
                        .   lots of colors
                        .
                        --d2l-color-olivine-light-1: var(--d2l-color-olivine-plus-1);
                        --d2l
                        <!-- ^^ the page source cuts off here, in <head> -->

The last line raises the following error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":"[id="bannerTitle"]"}

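I would suggest using WebDriverWait, By, and EC instead of banner = driver.find_element_by_id. I would also move print(driver.page_source) to after the banner has been found. We can also try scrolling down the page. Below, I have commented out some of your lines and added my suggested updates.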

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

url = "https://brightspace.nyu.edu/d2l/home"

driver = webdriver.Chrome()     # should open a Chrome window
driver.get(url)         # navigate to brightspace

# MFA Handling Code here #

# Explicitly wait until we reach the Brightspace home page (logged in)   
element = WebDriverWait(driver,100).until(EC.url_contains('https://brightspace.nyu.edu/d2l/home'))
# print(driver.page_source)
# banner = driver.find_element_by_id('bannerTitle')   # throws NoSuchElementException
##################################
######## NEW SUGGESTIONS #########
##################################
# Wait up to 100 seconds for the banner to be present and visible
banner = WebDriverWait(driver, 100).until(EC.visibility_of_element_located(
        (By.ID, "bannerTitle")))
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
print(driver.page_source)
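
If the printed source still appears to stop partway through, it may help to rule out the console as the cause before blaming Selenium. The following is only a minimal sketch: dump_page_source and the default file name page_source.html are hypothetical names introduced here, and the idea that the terminal is truncating the output is an assumption, not something confirmed in the question. It writes driver.page_source to a file and reports a few simple checks on the string Selenium actually returned.

```python
from pathlib import Path


def dump_page_source(driver, path="page_source.html"):
    """Save driver.page_source to a file and return simple completeness checks.

    `driver` is any object exposing a `page_source` string attribute,
    such as a Selenium WebDriver. The helper name and the default path
    are illustrative assumptions, not part of Selenium.
    """
    html = driver.page_source
    # Writing to a file sidesteps any limit on how much the console displays.
    Path(path).write_text(html, encoding="utf-8")
    return {
        "chars": len(html),                              # total size returned
        "has_closing_html": "</html>" in html.lower(),   # reached end of document?
        "has_banner_id": 'id="bannerTitle"' in html,     # target id in the markup?
    }
```

Calling print(dump_page_source(driver)) after the wait shows whether the string itself is complete. If has_closing_html is True while the console output was cut off, the truncation is in the display rather than in what Selenium returned. If has_banner_id is False, the element is not in the serialized markup at that moment, and a longer wait or a different locator strategy would be the next thing to try.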