Get a YouTube video title using class name and the text attribute with Selenium and Python

Hello, I'm using Python Selenium WebDriver to get a YouTube video title, but I keep getting more information than I want. The line is: driver.find_element_by_class_name("style-scope ytd-video-primary-info-renderer").text

Is there any way to fix it and make it more efficient, so that it only shows the title? Here is the test script I'm using:

from selenium import webdriver as wd
from time import sleep as zz

driver = wd.Firefox(executable_path=r'./geckodriver.exe')
driver.get('https://www.youtube.com/watch?v=wma0szfIafk')
zz(4)
test_atr = driver.find_element_by_class_name("style-scope ytd-video-primary-info-renderer").text
print(test_atr)

To print the title text OBI-WAN KENOBI Official Trailer (2022) Teaser you can use either of the following Locator Strategies:

  • Using css_selector and get_attribute("innerHTML"):

    print(driver.find_element(By.CSS_SELECTOR, "h1.title.style-scope.ytd-video-primary-info-renderer > yt-formatted-string.style-scope.ytd-video-primary-info-renderer").get_attribute("innerHTML"))
    
  • Using xpath and the text attribute:

    print(driver.find_element(By.XPATH, "//h1[@class='title style-scope ytd-video-primary-info-renderer']/yt-formatted-string[@class='style-scope ytd-video-primary-info-renderer']").text)
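
It may help to see how the class attribute from the question maps onto the CSS selector used above. The sketch below is plain Python with no Selenium calls; css_from_class_attr is a hypothetical helper introduced only for illustration. It also rests on an assumption about the Python bindings: a class-name lookup is translated into a CSS selector by prefixing a dot, so a value containing a space would become a descendant selector that matches the whole ytd-video-primary-info-renderer element, which would explain the extra text.

```python
def css_from_class_attr(tag: str, class_attr: str) -> str:
    """Build a CSS selector from a tag name and a space-separated class attribute.

    Hypothetical helper for illustration; it is not part of Selenium.
    """
    classes = class_attr.split()
    return tag + "".join("." + name for name in classes)


# Class attributes as they appear in the locators of this answer.
H1_CLASSES = "title style-scope ytd-video-primary-info-renderer"
TITLE_CLASSES = "style-scope ytd-video-primary-info-renderer"

# Assumption: a class-name lookup becomes "." + value, which here contains
# a space and therefore reads as "<tag> inside an element with .style-scope".
over_broad_selector = "." + TITLE_CLASSES

# Chaining every class with a dot and using the child combinator narrows
# the match down to the title element only.
title_selector = (
    css_from_class_attr("h1", H1_CLASSES)
    + " > "
    + css_from_class_attr("yt-formatted-string", TITLE_CLASSES)
)

print(over_broad_selector)
print(title_selector)
```

The second printed line is the same selector string passed to By.CSS_SELECTOR above.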
    

Ideally, you need to induce WebDriverWait for visibility_of_element_located() instead of a fixed sleep, and you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR and the text attribute:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h1.title.style-scope.ytd-video-primary-info-renderer > yt-formatted-string.style-scope.ytd-video-primary-info-renderer"))).text)
    
  • Using XPATH and get_attribute("innerHTML"):

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h1[@class='title style-scope ytd-video-primary-info-renderer']/yt-formatted-string[@class='style-scope ytd-video-primary-info-renderer']"))).get_attribute("innerHTML"))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Console output:

    OBI-WAN KENOBI Official Trailer (2022) Teaser
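
To show why an explicit wait is preferable to the fixed zz(4) in the question, here is a rough, standard-library-only sketch of the polling idea behind WebDriverWait: keep re-checking a condition until it yields a value or the timeout expires. This is not Selenium's implementation; wait_until and fake_title_lookup are hypothetical names, and the fake lookup simply pretends the title shows up on the third check.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until(condition: Callable[[], Optional[T]],
               timeout: float = 20.0,
               poll: float = 0.5) -> T:
    """Call condition repeatedly until it returns a truthy value.

    Raises TimeoutError if nothing truthy is returned within timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Hypothetical stand-in for a page whose title is rendered a little late.
attempts = {"count": 0}


def fake_title_lookup() -> Optional[str]:
    attempts["count"] += 1
    if attempts["count"] < 3:
        return None  # title not rendered yet
    return "OBI-WAN KENOBI Official Trailer (2022) Teaser"


title = wait_until(fake_title_lookup, timeout=2.0, poll=0.01)
print(title)
```

The call returns as soon as the title is available rather than always pausing for the full interval, and it fails loudly if the title never appears.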
    

You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python

