Python Selenium: Looping over the same element while scraping

Question

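Context:

I am trying to scrape each video's title, view count, and upload time from a YouTube channel, but the loop keeps scraping the same element.

Code trial: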

from selenium import webdriver
from selenium.webdriver.common.by import By

url = 'https://www.youtube.com/c/JohnWatsonRooney/videos?view=0&sort=p&flow=grid'
driver = webdriver.Chrome()
driver.get(url)

videos = driver.find_elements(by=By.CLASS_NAME, value='style-scope ytd-grid-video-renderer')

for video in videos:
  title = driver.find_element(by=By.XPATH, value='.//*[@id="video-title"]').text
  views = driver.find_element(by=By.XPATH, value='.//*[@id="metadata-line"]/span[1]').text
  when = driver.find_element(by=By.XPATH, value='.//*[@id="metadata-line"]/span[2]').text
  print(f"""Video Title: {title}\nViews: {views}\nUploaded: {when}\n -----------""")

Output:

Video Title: Scrapy for Beginners - A Complete How To Example Web Scraping Project
Views: 104K views
Uploaded: 1 year ago

Video Title: Scrapy for Beginners - A Complete How To Example Web Scraping Project
Views: 104K views
Uploaded: 1 year ago

Video Title: Scrapy for Beginners - A Complete How To Example Web Scraping Project
Views: 104K views
Uploaded: 1 year ago..

Answer 1

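To print the title, views, and when text from the website, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use the following locator strategies:

Code block: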

driver.get("https://www.youtube.com/c/JohnWatsonRooney/videos")
titles = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a#video-title")))]
views = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div#metadata-line > span:first-child")))]
when = [my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div#metadata-line > span:nth-child(2)")))]
for i, j, k in zip(titles, views, when):
    print(f"{i} had {j} since posted {k}")

Console output:

The Python Package I Wish I'd Learned Earlier had 4.5K views since posted 4 days ago
Rotate User Agents in Scrapy using custom Middleware had 1.2K views since posted 11 days ago
GO for Beginners - Web Scraping with Golang Tutorial had 2K views since posted 13 days ago
How I Scraped This HTML Table to a Python Dictionary had 3.6K views since posted 3 weeks ago
Python TYPE HINTS Explained with Examples had 2.3K views since posted 3 weeks ago
Use THIS Algorithm To Find KEYWORDS in Text - A Short Python Project had 4.2K views since posted 1 month ago
SQLModel is the Pydantic inspired Python ORM we’ve been waiting for had 3.4K views since posted 1 month ago
How to use Enumerate in Python to have a Counter in your loops had 4.6K views since posted 1 month ago
How to HIDE Your API Keys in Python Projects had 6.6K views since posted 1 month ago
How to Make 2500 HTTP Requests in 2 Seconds with Async & Await had 4.8K views since posted 2 months ago
Are You Still Using Excel? AUTOMATE it with PYTHON had 10K views since posted 2 months ago
How To Parse Data from HTML Tables Using Requests-HTML had 3.5K views since posted 2 months ago
THIS is the most common ERROR when learning Web Scraping had 3.6K views since posted 2 months ago
Learn Web Scraping With Python: Full Project - HTML, Save to CSV, Pagination had 10K views since posted 3 months ago
Turn Websites into Real Time API's with ScrapyRT had 4.4K views since posted 3 months ago
HTTPX is the ASYNC Requests I was Looking For had 5.8K views since posted 3 months ago
Research Amazon Products by Extracting Review Data had 4K views since posted 4 months ago
Web Scraping 101 - My (in)complete guide, methods, tools, how to had 5.4K views since posted 4 months ago
How to Scrape JavaScript Websites with Scrapy and Playwright had 9.8K views since posted 5 months ago
Web Scraping with Node js? Python Expert Opinion and demo had 3.3K views since posted 5 months ago
Automate Buying online using Playwright’s Codegen feature had 4.6K views since posted 5 months ago
Login and Scrape Data with Playwright and Python had 16K views since posted 5 months ago
Parse HTML with BeautifulSoup AND Scrapy had 2.6K views since posted 5 months ago
HIDING Data with JavaScript? Web Scraping Obfuscation had 3.9K views since posted 5 months ago
Following LINKS Automatically with Scrapy CrawlSpider had 6.2K views since posted 6 months ago
Web Scraping Weather Data with Python had 13K views since posted 6 months ago
How I Try My CODE and Test My SELECTORS in SCRAPY had 1.8K views since posted 6 months ago
How To Handle Errors & Exceptions with Requests and Python had 2.8K views since posted 6 months ago
A Short and SIMPLE HTML Web Scraper in 6 lines of CODE had 1.9K views since posted 7 months ago
Failed Requests? Try this RETRY Decorator for your Web Scraper had 3.5K views since posted 7 months ago

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
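
As a side note on why the loop in the question prints the first video repeatedly: each lookup there is called on driver, so the relative XPath is evaluated against the whole page and returns the first match every time. Below is a minimal sketch of the same loop with the lookups called on each video element instead. The helper name collect_video_rows and the XPATH constant are hypothetical and not part of the original post; the XPath expressions are the ones from the question and are assumed to still match the page.

```python
# Selenium's By.XPATH constant is the plain string "xpath". It is spelled out
# here (an assumption made for illustration) so the helper has no import-time
# dependency on a browser setup.
XPATH = "xpath"


def collect_video_rows(videos):
    """Return one (title, views, uploaded) tuple per video card.

    `videos` is assumed to be the list returned by driver.find_elements(...),
    with one WebElement per video card.
    """
    rows = []
    for video in videos:
        # Search from `video`, not `driver`: a relative XPath (".//") is
        # resolved against the element that find_element is called on.
        title = video.find_element(XPATH, './/*[@id="video-title"]').text
        views = video.find_element(XPATH, './/*[@id="metadata-line"]/span[1]').text
        when = video.find_element(XPATH, './/*[@id="metadata-line"]/span[2]').text
        rows.append((title, views, when))
    return rows
```

With a live session, the list passed in would be the videos list from the question, and each returned tuple can be printed with the same f-string used there.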
