How to automate a dynamically loaded webpage with Selenium in Python

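I have been trying to automate some tasks on a dynamically loaded page with Selenium, but it cannot locate the dynamically generated elements by XPath, by value, by tag, or by anything else. It behaves as if requesting the page returned only the raw source, without the scripts being executed to generate the rest of the content. As a web browser extension it works fine, but once exported to a Python module it does not. Is there a way to make it work in a script as expected, the same way it does in the browser extension?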

Say we have a webpage like this:

<html>
(...)
<body>
  <app-root></app-root>
<script src="REDACTED.js" defer></script><script src="REDACTED.js" nomodule defer></script><script src="REDACTED.js" defer></script><script src="REDACTED.js" defer></script><script src="REDACTED.js" defer></script></body>
</html>

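The browser processes these scripts just fine and displays the buttons, the UI, and so on. Likewise, interacting with the page through the Selenium IDE browser extension does its job. When I export the recording to a Python script, it looks like this: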

# Generated by Selenium IDE
import pytest
import time
import json
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

class TestTest1():
  def setup_method(self, method):
    self.driver = webdriver.Chrome()
    self.vars = {}
  
  def teardown_method(self, method):
    self.driver.quit()
  
  def test_test1(self):
    self.driver.get("https://REDACTED")

    # Logging in to the portal I am interacting with - works fine
    logform = self.driver.find_element_by_id("loginForm")
    logform.find_element_by_name("username").send_keys('REDACTED')
    logform.find_element_by_name("password").send_keys('REDACTED')
    logform.find_element_by_name("login").click()
    
    # Below code does not do anything, as it's not able to find these elements 
    # on the webpage and interact with them
    self.driver.find_element(By.CSS_SELECTOR, ".btn--cta").click()
    self.driver.find_element(By.CSS_SELECTOR, ".create-flow-btn--major > img").click()
    element = self.driver.find_element(By.CSS_SELECTOR, ".create-flow-btn--major > img")
    actions = ActionChains(self.driver)
    actions.move_to_element(element).perform()
    element = self.driver.find_element(By.CSS_SELECTOR, "body")
    actions = ActionChains(self.driver)
    actions.move_to_element(element, 0, 0).perform()
    self.driver.close()
  
p1 = TestTest1()
p1.setup_method(1)
p1.test_test1()
p1.teardown_method(1)

When run, the script stops while trying to find the requested elements and throws the following exception:

(...) line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".btn--cta"}

What should I do?

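As I mentioned in the comments, you need to use an explicit wait so that the script waits for the element to become visible on the page before interacting with it. After the login click, the page is still rendering its content with JavaScript, so an immediate find_element() call runs before the element exists.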

Use WebDriverWait() and wait for visibility_of_element_located() with your locator:

WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "something")))

You will need the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By