How to extract the dynamic images from the link using Selenium and Python?
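I am trying to download the images from the link. Since the images are rendered dynamically, how can I download them?
Right now I am trying to get the image URLs, but because the images are rendered dynamically I am only able to get the first link.
The code is as follows: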
import os
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import Select
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

executable_path = "/usr/bin/chromedriver"
chrome_options = Options()
os.environ["webdriver.chrome.driver"] = executable_path
driver = webdriver.Chrome(executable_path=executable_path, chrome_options=chrome_options)
driver.get("https://www.macfarlanepartners.com/projects/park-fifth-mid-rise/")
driver.maximize_window()
print("Entered")
# Locate the container that holds the images
elements = driver.find_elements_by_xpath("""/html/body/div/div/div[3]/div[1]/div[1]""")
time.sleep(5)
for i in elements:
    # find_element_* (singular) returns only the first matching <img>
    image = i.find_element_by_tag_name("img")
    img_src = image.get_attribute("src")
    print(img_src)
driver.close()
Output
I only get the first image link:
https://www.macfarlanepartners.com/wp-content/uploads/2016/08/Park-Fifth-Mid-Rise-Large-Photo-1-480x240.jpg
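Also, is there a way to find the img tags and their src attributes automatically and download the images, rather than working out an XPath for each website?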
I think you need a second loop:
for i in elements:
    # find_elements_* (plural) returns every <img> inside the container
    images = i.find_elements_by_tag_name("img")
    for img in images:
        img_src = img.get_attribute("src")
        print(img_src)
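The same idea of collecting every <img> rather than only the first one can be applied to the whole rendered page, which avoids a site-specific XPath. The following is a minimal sketch in Python 3 using only the standard library; ImgSrcCollector and extract_img_srcs are hypothetical names introduced here, and it assumes you pass in the HTML after the browser has finished rendering it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Collects the src attribute of every <img> tag it encounters."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "img" also matches <IMG>
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the page URL
            absolute = urljoin(self.base_url, src)
            if absolute not in self.sources:
                self.sources.append(absolute)


def extract_img_srcs(html, base_url):
    """Return the de-duplicated, absolute src URLs of all <img> tags in html."""
    collector = ImgSrcCollector(base_url)
    collector.feed(html)
    return collector.sources
```

With a live driver this would presumably be called as extract_img_srcs(driver.page_source, driver.current_url) once the page has loaded. Images that the page only injects later, or that use attributes other than src, would not be picked up by this sketch.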
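You can only retrieve the first link because the locator strategy find_elements_by_xpath("""/html/body/div/div/div[3]/div[1]/div[1]""") returns only one element. That is why the for loop iterates only once, and find_element_by_tag_name("img") then extracts the value of the src attribute of the first <img> tag only.
A better solution is to include the <img> tag in the XPath, as follows: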
# The XPath now ends in /img, so every image is returned as its own element
elements = driver.find_elements_by_xpath("""/html/body/div/div/div[3]/div[1]/div[1]/img""")
for i in elements:
    img_src = i.get_attribute("src")
    print(img_src)
You will be able to retrieve the 3 links as follows:
https://www.macfarlanepartners.com/wp-content/uploads/2016/08/Park-Fifth-Mid-Rise-Large-Photo-1-480x240.jpg
https://www.macfarlanepartners.com/wp-content/uploads/2016/08/Park-Fifth-Mid-Rise-Large-Photo-2-480x240.jpg
https://www.macfarlanepartners.com/wp-content/uploads/2016/08/Park-Fifth-Mid-Rise-Large-Photo-3-1-480x240.jpg
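Once the src URLs are known, the download itself does not need the browser. Below is a rough sketch in Python 3 using only the standard library; filename_from_url and download_images are hypothetical helper names, the 30-second timeout is an arbitrary choice, and it assumes the server hands out the images to a plain HTTP client without cookies or special headers.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlopen


def filename_from_url(url):
    """Return the last path segment of the URL, or a fallback name."""
    name = os.path.basename(urlparse(url).path)
    return name or "image"


def download_images(urls, dest_dir):
    """Save each URL into dest_dir and return the list of written paths."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for url in urls:
        path = os.path.join(dest_dir, filename_from_url(url))
        # Read the whole response into memory; fine for small images
        with urlopen(url, timeout=30) as response, open(path, "wb") as out:
            out.write(response.read())
        saved.append(path)
    return saved
```

For the links above, download_images(list_of_srcs, "images") would write files such as images/Park-Fifth-Mid-Rise-Large-Photo-1-480x240.jpg. Two URLs that end in the same file name would overwrite each other in this sketch.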