How to move to the next page on Python Selenium?
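I'm trying to build a proxy scraper for a specific site, but I can't get it to move to the next page.
This is the code I'm using.
If you answer my question, please explain what you used, and if you know any good tutorials for this kind of code, please share them: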
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import time
options = Options()
#options.headless = True #for headless
#options.add_argument('--disable-gpu') #for headless and os win
driver = webdriver.Chrome(options=options)
driver.get("https://hidemyna.me/en/proxy-list/")
time.sleep(10) #bypass cloudflare
tbody = driver.find_element_by_tag_name("tbody")
cell = tbody.find_elements_by_tag_name("tr")
for column in cell:
    column = column.text.split(" ")
    print(column[0] + ":" + column[1])  # ip and port
nxt = driver.find_element_by_class_name('arrow_right')
nxt.click()
You are not actually clicking the anchor <a> tag. To navigate to the next page, you need to click on the <a> link itself.
You can use find_element_by_xpath as shown below.
driver.find_element_by_xpath('//*[@id="content-section"]/section[1]/div/div[4]/ul/li[1]/a').click()
Instead of the XPath, you can also use the CSS selector suggested in another answer.
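For what it's worth, here is a rough sketch of the CSS selector route. The selector `li.arrow__right > a` is an assumption derived from the XPath used elsewhere in this thread, and `click_next` is a hypothetical helper name. The locator is written as the plain string "css selector", which is the value of Selenium's `By.CSS_SELECTOR`, so the helper only needs an object with a `find_elements(by, value)` method.

```python
# Locator for the "next page" anchor. The class name is assumed from the
# XPath quoted in this thread; inspect the page to confirm it.
# "css selector" is the string value behind Selenium's By.CSS_SELECTOR.
NEXT_LINK = ("css selector", "li.arrow__right > a")


def click_next(driver, locator=NEXT_LINK):
    """Click the first element matching locator.

    Returns True if a link was clicked, False if nothing matched.
    """
    # find_elements returns an empty list instead of raising when
    # nothing matches, which makes the "last page" case easy to detect.
    matches = driver.find_elements(*locator)
    if not matches:
        return False
    matches[0].click()
    return True
```

With a real browser you would call `click_next(driver)` after `driver.get(...)`. The point is that the click goes to the `<a>` inside the `<li>`, not to the `<li>` itself.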
To move to the next page, you can try the following solution:
Code Block:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException, WebDriverException
options = Options()
options.add_argument("start-maximized")
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
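# Selenium 3 style; recent Selenium 4 releases dropped chrome_options and executable_path, use webdriver.Chrome(options=options) there
driver = webdriver.Chrome(chrome_options=options, executable_path=r'C:\Utility\BrowserDrivers\chromedriver.exe')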
driver.get('https://hidemyna.me/en/proxy-list/')
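while True:
    try:
        # Wait up to 20 seconds for the "next" link to be clickable, then scroll it into view
        driver.execute_script("return arguments[0].scrollIntoView(true);", WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//li[@class='arrow__right']/a"))))
        # On Selenium 4.3+ this call is written driver.find_element(By.XPATH, ...)
        driver.find_element_by_xpath("//li[@class='arrow__right']/a").click()
        print("Navigating to Next Page")
    except (TimeoutException, WebDriverException) as e:
        # No clickable "next" link was found in time, so treat this as the last page
        print("Last page reached")
        break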
driver.quit()
Console Output:
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
Navigating to Next Page
.
.
.
Navigating to Next Page
Last page reached
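The loop above only navigates. If you also want to collect the IP and port on each page, as in the question, something along these lines might work. This is only a sketch: `parse_row` and `collect_proxies` are hypothetical names, the selectors `tbody tr` and `li.arrow__right > a` are assumptions taken from the code in this thread, and `after_click` is a hook where you would put your own explicit wait for the new table.

```python
def parse_row(text):
    """Turn row text like 'IP PORT ...' into 'ip:port'.

    Returns None if the row has fewer than two fields.
    """
    # split() without arguments also copes with tabs and repeated spaces,
    # unlike split(" ") in the question.
    fields = text.split()
    if len(fields) < 2:
        return None
    return fields[0] + ":" + fields[1]


def collect_proxies(driver, max_pages=50, after_click=None):
    """Scrape the table on each page, then follow the 'next' link."""
    proxies = []
    for _ in range(max_pages):  # hard cap so the loop always ends
        # "css selector" is the string value of Selenium's By.CSS_SELECTOR.
        for row in driver.find_elements("css selector", "tbody tr"):
            proxy = parse_row(row.text)
            if proxy:
                proxies.append(proxy)
        next_links = driver.find_elements("css selector", "li.arrow__right > a")
        if not next_links:
            break  # no "next" link: last page reached
        next_links[0].click()
        if after_click:
            after_click(driver)  # e.g. wait until the new rows have loaded
    return proxies
```

On a real site you would almost certainly need to pass an `after_click` function that waits for the page to finish loading, otherwise the rows may be read before the table has changed.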
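The next button tends to differ from one website to another... you have to inspect the button and address it with XPath or BeautifulSoup.
Usually there is a 'next page' and a 'previous page' link... point your XPath at the 'next' one.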
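As an illustration of picking out the 'next' link by its text, here is a small sketch. To keep it dependency-free it uses the standard library's `html.parser` rather than BeautifulSoup, so it is a stand-in for that approach, not the same API. `NextLinkFinder` and `find_next_href` are hypothetical names, and the idea that the link text contains "next" is an assumption you would check by inspecting the page.

```python
from html.parser import HTMLParser


class NextLinkFinder(HTMLParser):
    """Remember the href of the first <a> whose text contains a label."""

    def __init__(self, label="next"):
        super().__init__()
        self.label = label.lower()
        self.href = None            # href of the first matching link
        self._current_href = None   # href of the <a> currently being read
        self._text = []             # text pieces inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._current_href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            link_text = "".join(self._text).lower()
            if self.href is None and self.label in link_text:
                self.href = self._current_href
            self._current_href = None


def find_next_href(html, label="next"):
    """Return the href of the 'next' link in html, or None if absent."""
    finder = NextLinkFinder(label)
    finder.feed(html)
    return finder.href
```

You could feed it `driver.page_source` and then load the returned URL with `driver.get(...)`, joining it with the site's base URL first if it is relative.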