InvalidArgumentException: Message: invalid argument: 'url' must be a string when invoking a URL using get()
First, I collect the product URLs from all the result pages. But when I then try to open each URL one by one, it fails. How can I open each page?
!pip install selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import urllib as url
from urllib.parse import urlparse
import time

browser = webdriver.Chrome(executable_path='./chromedriver.exe')
wait = WebDriverWait(browser,5)
output = []
for i in range(1,2): # Iterate from page 1 to the last page
    browser.get("https://tw.mall.yahoo.com/search/product?p=%E5%B1%88%E8%87%A3%E6%B0%8F&pg={}".format(i))
    # Wait Until the product appear
    wait.until(EC.presence_of_element_located((By.XPATH,"//ul[@class='gridList']")))
    # Get the products
    product_links = browser.find_elements(By.XPATH,"//ul[@class='gridList']/li/a")
    # Iterate over 'product_links' to get all the 'href' values
    for link in (product_links):
        print(f"{link.get_attribute('href')}")
        output.append([link.get_attribute('href')])
for b in output:
    browser.get(b)
Output:
InvalidArgumentException: Message: invalid argument: 'url' must be a string
(Session info: chrome=96.0.4664.45)
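i is of type int, but str.format() already converts it to text, so wrapping it as format(str(i)) is optional and does not change the URL passed to the first browser.get() call.
The argument that is not a string is b in the final loop: output.append([link.get_attribute('href')]) wraps each href in a one-element list, so browser.get(b) receives a list and Chrome rejects it with "'url' must be a string".
Solution
Append the href itself so that output holds plain strings. The effective lines of code will be:
        output.append(link.get_attribute('href'))
for b in output:
    browser.get(b)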
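One way to guard against passing a non-string to browser.get() is to collect the links through a small helper that keeps only plain string URLs. The following is a minimal sketch rather than part of the original post: collect_hrefs and FakeLink are hypothetical names, and FakeLink only imitates the get_attribute() method of a Selenium element so the example runs without a browser.

```python
from urllib.parse import urlparse


def collect_hrefs(links):
    """Return each element's href as a plain string, skipping missing or non-HTTP values."""
    hrefs = []
    for link in links:
        href = link.get_attribute("href")
        # get_attribute() returns None when the attribute is absent
        if isinstance(href, str) and urlparse(href).scheme in ("http", "https"):
            hrefs.append(href)  # append the string itself, not [href]
    return hrefs


class FakeLink:
    """Hypothetical stand-in for a Selenium WebElement, for illustration only."""

    def __init__(self, href):
        self._href = href

    def get_attribute(self, name):
        return self._href if name == "href" else None


# Placeholder URLs, not taken from the original question
links = [
    FakeLink("https://example.com/item/1"),
    FakeLink(None),
    FakeLink("https://example.com/item/2"),
]
output = collect_hrefs(links)
print(output)
```

With a real driver, the same helper could be called as collect_hrefs(browser.find_elements(By.XPATH, "//ul[@class='gridList']/li/a")), after which every item in output can be passed directly to browser.get().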