Get YouTube video title using class name and text attribute with Selenium and Python
Hello, I'm using the Python Selenium WebDriver to get a YouTube video title, but I keep getting more information than I want.
The line is:
driver.find_element_by_class_name("style-scope ytd-video-primary-info-renderer").text
Is there any way to fix it and make it more efficient, so that it displays only the title?
This is the test script I use:
from selenium import webdriver as wd
from time import sleep as zz
driver = wd.Firefox(executable_path=r'./geckodriver.exe')
driver.get('https://www.youtube.com/watch?v=wma0szfIafk')
zz(4)
test_atr = driver.find_element_by_class_name("style-scope ytd-video-primary-info-renderer").text
print(test_atr)
To print the title text OBI-WAN KENOBI Official Trailer (2022) Teaser you can use either of the following Locator Strategies:
Using css_selector and get_attribute("innerHTML"):
print(driver.find_element(By.CSS_SELECTOR, "h1.title.style-scope.ytd-video-primary-info-renderer > yt-formatted-string.style-scope.ytd-video-primary-info-renderer").get_attribute("innerHTML"))
Using xpath and the text attribute:
print(driver.find_element(By.XPATH, "//h1[@class='title style-scope ytd-video-primary-info-renderer']/yt-formatted-string[@class='style-scope ytd-video-primary-info-renderer']").text)
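As for why the original locator returned too much: a class-name locator is meant for a single class name, and "style-scope ytd-video-primary-info-renderer" contains two. As far as I know, recent Selenium Python releases turn a class-name lookup into a CSS selector by prefixing a dot, so that string would be read as ".style-scope ytd-video-primary-info-renderer", a descendant selector that matches the whole renderer element rather than the title alone. Chaining every class with dots avoids this. A minimal sketch of building such a selector from the class attributes seen in the page source (class_attr_to_css is a hypothetical helper name, not part of Selenium):

```python
def class_attr_to_css(tag, class_attr):
    """Build a CSS selector for `tag` elements carrying every class in `class_attr`."""
    # A class attribute is space-separated; in CSS each class is chained with a dot.
    classes = class_attr.split()
    return tag + "".join("." + name for name in classes)


# Class attributes taken from the locators in this answer.
h1 = class_attr_to_css("h1", "title style-scope ytd-video-primary-info-renderer")
title = class_attr_to_css("yt-formatted-string", "style-scope ytd-video-primary-info-renderer")

# ">" selects a direct child, matching the "/" step in the XPath version.
selector = f"{h1} > {title}"
print(selector)
```

The printed string can be passed as the second argument to driver.find_element(By.CSS_SELECTOR, ...).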
Ideally you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following Locator Strategies:
Using CSS_SELECTOR and the text attribute:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "h1.title.style-scope.ytd-video-primary-info-renderer > yt-formatted-string.style-scope.ytd-video-primary-info-renderer"))).text)
Using XPATH and get_attribute("innerHTML"):
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//h1[@class='title style-scope ytd-video-primary-info-renderer']/yt-formatted-string[@class='style-scope ytd-video-primary-info-renderer']"))).get_attribute("innerHTML"))
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Console Output:
OBI-WAN KENOBI Official Trailer (2022) Teaser
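The WebDriverWait calls above replace the fixed 4-second sleep in the question: instead of always pausing for the full interval, the condition is checked repeatedly and the result is returned as soon as it is available, with an error if the time limit is reached. The plain-Python sketch below only illustrates that polling idea. It is not Selenium's implementation, and wait_until and fake_title_lookup are hypothetical names used for the demonstration.

```python
import time


def wait_until(condition, timeout=20.0, poll=0.5):
    """Call `condition()` until it returns a truthy value or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for a page whose title only becomes available on the third check.
attempts = {"count": 0}


def fake_title_lookup():
    attempts["count"] += 1
    if attempts["count"] < 3:
        return None
    return "OBI-WAN KENOBI Official Trailer (2022) Teaser"


print(wait_until(fake_title_lookup, timeout=5, poll=0.1))
```

Here the lookup succeeds after about 0.2 seconds instead of a fixed pause, which is the practical advantage of an explicit wait over sleep().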
You can find a relevant discussion in How to retrieve the text of a WebElement using Selenium - Python
References
Links to useful documentation:
get_attribute() method: Gets the given attribute or property of the element.
text attribute: returns The text of the element.
- Difference between text and innerHTML using Selenium