Scrape Amazon image/picture URL in Python with Selenium

Question

I need help scraping the Amazon URL of an image/picture on a product page (the first image, the large one shown on screen) using Selenium in Python. For example, this product: https://www.amazon.fr/dp/B07CG3HFPV/ref=cm_sw_r_fm_api_glt_i_2RB9QBPTQXWJ7PQQ16MZ?_encoding=UTF8&psc=1

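Here is the relevant part of the page source:

I need to scrape the image URL held in the "src" attribute.

Does anyone know how to scrape it? I currently have this part of a script, but it doesn't work: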

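import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.action_chains import ActionChains
from selenium.webdriver.common.by import By

url = "https://www.amazon.fr/dp/B07CG3HFPV/ref=cm_sw_r_fm_api_glt_i_2RB9QBPTQXWJ7PQQ16MZ?_encoding=UTF8&psc=1"

options = Options()
options.add_argument("--headless")  # run Chrome without a visible window

driver = webdriver.Chrome(options=options)
driver.get(url)
time.sleep(2)  # fixed pause to let the page load

actions = ActionChains(driver)

# Returns the src of the first <img> found anywhere on the page
link_img = driver.find_element(By.TAG_NAME, "img").get_attribute("src")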

Thanks for your help.

Answer 1

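To scrape the Amazon URL of the image/picture on the product page (the first image, the large one shown on screen) using Selenium in Python, you need to induce WebDriverWait for visibility_of_element_located(), and you can use either of the following locator strategies:

Using CSS_SELECTOR: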

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.a-list-item>span.a-declarative>div.imgTagWrapper>img.a-dynamic-image"))).get_attribute("src"))

Using XPATH:

print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[@class='a-list-item']/span[@class='a-declarative']/div[@class='imgTagWrapper']/img[@class='a-dynamic-image']"))).get_attribute("src"))

Note: You have to add the following imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
