How to automate a dynamically loaded webpage with Selenium in Python

Question

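I have been trying to use Selenium to automate some tasks on a dynamically loaded page, but it cannot locate the dynamically generated elements by XPath, value, tag, or anything else. It behaves as if it requested the page and received only the source code, without running the scripts that generate the rest of the content. As a web browser extension (Selenium IDE) it works fine, but once exported to a Python module it does not. Is there a way to make it work in a script as it does in the browser extension?

Say we have a webpage like this: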

<html>
(...)
<body>
  <app-root></app-root>
<script src="REDACTED.js" defer></script><script src="REDACTED.js" nomodule defer></script><script src="REDACTED.js" defer></script><script src="REDACTED.js" defer></script><script src="REDACTED.js" defer></script></body>
</html>

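The browser parses these scripts without trouble and shows the buttons, the UI, and so on. Likewise, interacting with the page through the Selenium IDE browser extension does its job. When I export the recording to a Python script, it looks like this: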

# Generated by Selenium IDE
import pytest
import time
import json
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.support import expected_conditions
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

class TestTest1():
  def setup_method(self, method):
    self.driver = webdriver.Chrome()
    self.vars = {}
  
  def teardown_method(self, method):
    self.driver.quit()
  
  def test_test1(self):
    self.driver.get("https://REDACTED")

    # Logging in to the portal I am interacting with - works fine
    logform = self.driver.find_element_by_id("loginForm")
    logform.find_element_by_name("username").send_keys('REDACTED')
    logform.find_element_by_name("password").send_keys('REDACTED')
    logform.find_element_by_name("login").click()
    
    # Below code does not do anything, as it's not able to find these elements 
    # on the webpage and interact with them
    self.driver.find_element(By.CSS_SELECTOR, ".btn--cta").click()
    self.driver.find_element(By.CSS_SELECTOR, ".create-flow-btn--major > img").click()
    element = self.driver.find_element(By.CSS_SELECTOR, ".create-flow-btn--major > img")
    actions = ActionChains(self.driver)
    actions.move_to_element(element).perform()
    element = self.driver.find_element(By.CSS_SELECTOR, "body")
    actions = ActionChains(self.driver)
    actions.move_to_element(element, 0, 0).perform()
    self.driver.close()
  
p1 = TestTest1()
p1.setup_method(1)
p1.test_test1()
p1.teardown_method(1)

When run, the script stops while trying to find the requested elements and throws the following exception:

(...) line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".btn--cta"}

What should I do?

Answer 1

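As I mentioned in the comments, you need an explicit wait so that the element is visible on the page before you interact with it. The elements are created by scripts after the initial page load, so a find_element() call made immediately after login runs before they exist.

Use WebDriverWait() and wait for visibility_of_element_located() with your locator (browser below is your WebDriver instance, self.driver in the question).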
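To illustrate why the wait matters, here is a minimal sketch of the polling idea behind an explicit wait, with no Selenium involved. The names wait_until and FakePage are hypothetical and exist only for this illustration; Selenium's WebDriverWait is more elaborate, but it likewise retries a condition until it succeeds or the timeout expires.

```python
import time


def wait_until(condition, timeout=5.0, poll_frequency=0.1):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_frequency)


class FakePage:
    """Hypothetical stand-in for a page whose button is rendered by a script after a delay."""

    def __init__(self, render_delay):
        self._ready_at = time.monotonic() + render_delay

    def find(self, selector):
        # Returns None until the simulated script has "rendered" the element.
        if time.monotonic() < self._ready_at:
            return None
        return f"<element {selector}>"


page = FakePage(render_delay=0.3)
immediate = page.find(".btn--cta")                   # looked up too early
waited = wait_until(lambda: page.find(".btn--cta"))  # retried until it appears
print(immediate, waited)
```

The immediate lookup corresponds to the failing find_element() call in the question; the retried lookup corresponds to what an explicit wait does.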

WebDriverWait(browser, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "something")))

You need the following imports.

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

