Selenium Web Scraping: Find element by text not working in script

Question

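I'm working on a script to collect information from Newegg to track changes in graphics card prices over time. Currently, my script opens a Newegg search for RTX 3080 through Chromedriver and then clicks the link for Desktop Graphics Cards to narrow down the search. The part I'm struggling with is writing a for item in range loop that lets me iterate through all 8 search result pages. I know I could do this by simply changing the page number in the URL, but since this is an exercise I'm using to learn relative XPath better, I want to do it with the pagination buttons at the bottom of the page. I know that each button should contain the inner text "1, 2, 3, 4, etc.", but whenever I use text() = {item} in my for loop, it doesn't click the button. The script runs and doesn't raise any exceptions, but it also doesn't do what I want. Below I have attached the HTML for the page along with my current script. Any suggestions or hints are appreciated.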

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
import pandas as pd
import time

options = Options()

PATH = 'C://Program Files (x86)//chromedriver.exe'

driver = webdriver.Chrome(PATH)

url = 'https://www.newegg.com/p/pl?d=RTX+3080'

driver.maximize_window()
driver.get(url)

card_path = '/html/body/div[8]/div[3]/section/div/div/div[1]/div/dl[1]/dd/ul[2]/li/a'
desktop_graphics_cards = driver.find_element(By.XPATH, card_path)
desktop_graphics_cards.click()
time.sleep(5)

graphics_card = []
shipping_cost = []
price = []
total_cost = []

for item in range(9):
    try:
        #next_page_click = driver.find_element(By.XPATH("//button[text() = '{item + 1}']"))
        print(next_page_click)
        next_page_click.click()
    except:
        pass

Answer 1

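The pagination buttons are outside the initially visible area of the page.
To click these elements, you have to scroll the page until the element comes into view.
Also, you are trying to do this with the numbers 1 through 9, whereas you need to click the next-page buttons from 2 up to 9 (inclusive).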
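The page-number point can be checked without a browser. The sketch below only builds the locator strings for the buttons; `page_button_xpath` is a hypothetical helper name, and it assumes each button's text is exactly the page number with no surrounding whitespace, which the real markup may not guarantee. Note the `f` prefix, which the commented-out line in the question lacks.

```python
def page_button_xpath(page: int) -> str:
    """Build a relative XPath for the pagination button labelled `page`."""
    # Without the f prefix the braces would stay in the string literally,
    # and the expression would never match a button.
    return f"//button[text() = '{page}']"


# Start at 2 and stop at 9, the range the answer suggests.
xpaths = [page_button_xpath(page) for page in range(2, 10)]

print(xpaths[0])    # //button[text() = '2']
print(len(xpaths))  # 8
```

If the button text turns out to carry whitespace, `normalize-space(text())` in place of `text()` is a common XPath alternative.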
I think this should work better:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.action_chains import ActionChains
import pandas as pd
import time

options = Options()

PATH = 'C://Program Files (x86)//chromedriver.exe'

driver = webdriver.Chrome(PATH)

url = 'https://www.newegg.com/p/pl?d=RTX+3080'
actions = ActionChains(driver)

driver.maximize_window()
driver.get(url)

card_path = '/html/body/div[8]/div[3]/section/div/div/div[1]/div/dl[1]/dd/ul[2]/li/a'
desktop_graphics_cards = driver.find_element(By.XPATH, card_path)
desktop_graphics_cards.click()
time.sleep(5)

graphics_card = []
shipping_cost = []
price = []
total_cost = []

for item in range(2,10):
    try:
        # By.XPATH is a plain string, so pass it and the expression as two arguments
        next_page_click = driver.find_element(By.XPATH, f"//button[text() = '{item}']")
        # scroll the button into view before clicking
        actions.move_to_element(next_page_click).perform()
        time.sleep(2)
        # print(next_page_click) - printing a web element itself will not give you usable information
        next_page_click.click()
        # let the next page load; it takes some time
        time.sleep(5)
    except:
        pass
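A possible reason a loop like this can do nothing without reporting an error: in Selenium's Python bindings `By.XPATH` is a plain string (`"xpath"`), so the call form `By.XPATH("...")`, as written in the question's commented-out line, raises `TypeError`, and a bare `except: pass` hides it. The sketch below reproduces this with a stand-in constant so it runs without a browser; `XPATH` and `locate_wrong` are illustrative names, not part of Selenium.

```python
XPATH = "xpath"  # stand-in for selenium's By.XPATH, which is this same string


def locate_wrong(expression):
    """Mimic By.XPATH(expression): calling a str raises TypeError."""
    return XPATH(expression)


# A bare except swallows the TypeError, so the loop seems to run cleanly.
silent_failures = 0
for page in range(2, 10):
    try:
        locate_wrong(f"//button[text() = '{page}']")
    except:  # deliberately bare, to show what gets hidden
        silent_failures += 1

# Catching the exception explicitly shows what actually went wrong.
try:
    locate_wrong("//button[text() = '2']")
except TypeError as exc:
    message = str(exc)

print(silent_failures)  # 8
print(message)          # 'str' object is not callable
```

In the real script, narrowing the handler to `except NoSuchElementException:` (already imported above) would let an error like this stop the run with a traceback instead of passing silently.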

python

selenium

xpath

for-loop

selenium-chromedriver