如何在Python中获取div标签中的样式值？

Question

我想抓取单个网页中的图片，图片 URL 在 div 标签中，作为样式值进行验证，如下所示：

<div class="v-image__image v-image__image--cover" style="background-image: url(&quot;https://mashinbank.com/api/parse/files/7uPtEVa0plEFoNExiYHcbtL1rQnpIGnnPHVuvKKu/f3ea4910e239eb704af755c65f548e35_car.png&quot;); background-position: center center;"></div>

我想得到：https://mashinbank.com/api/parse/files/7uPtEVa0plEFoNExiYHcbtL1rQnpIGnnPHVuvKKu/f3ea4910e239eb704af755c65f548e35_car.png

但是当我尝试 chrome 驱动程序查找元素或 soup.find 它们 return 空列表时，那是因为 div 标签之间的文本什么都没有。

我正在寻找一种方法让进入 div 标签，而不是介于两者之间。

Answer 1

Selenium针对此问题的解决方案如下：您可能应该等待元素可见性，然后才提取元素属性。
拆分整个样式属性以获得 url 值。
像这样：

wait = WebDriverWait(driver, 20)
element = wait.until(EC.visibility_of_element_located((By.XPATH, "//div[contains(@style,'https://mashinbank.com/api/parse/files')]")))
style_content = element.get_attribute("style")
url = style_content.split(";")[1]

Answer 2

要获得您应该获得的所有 images，use presence of all the elements.

一旦你在 Python 中有了 list，例如 all_images（见下文），你可以 remove the () 和 "" 如下所示。

示例代码：

driver = webdriver.Chrome(driver_path)
driver.maximize_window()
#driver.implicitly_wait(50)
wait = WebDriverWait(driver, 20)
links = []
driver.get("https://mashinbank.com/ad/GkbI20tzp3/%D8%AE%D8%B1%DB%8C%D8%AF-%D9%BE%D8%B1%D8%A7%DB%8C%D8%AF-111-SE-1397")
all_images = WebDriverWait(driver, 20).until(EC.presence_of_all_elements_located((By.XPATH, "//div[contains(@class,'v-image__image--cover')]")))
for image in all_images:
    a = image.get_attribute('style')
    b = a.split("(")[1].split(")")[0].replace('"', '')
    links.append(b)

print(links)

进口：

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC

输出：

['https://mashinbank.com/api/parse/files/7uPtEVa0plEFoNExiYHcbtL1rQnpIGnnPHVuvKKu/cabdf9f3f379e5b839300f89a90ab27e_car.png', 'https://mashinbank.com/api/parse/files/7uPtEVa0plEFoNExiYHcbtL1rQnpIGnnPHVuvKKu/e1c6c75dda980a6b4b4a83932ed49832_car.png', 'https://mashinbank.com/api/parse/files/7uPtEVa0plEFoNExiYHcbtL1rQnpIGnnPHVuvKKu/81ef7c57ca349485a9ba78bf0e42e13f_car.png', 'https://mashinbank.com/api/parse/files/7uPtEVa0plEFoNExiYHcbtL1rQnpIGnnPHVuvKKu/02bd13f2c5ce936ec3db10706c03854d_car.png', 'https://mashinbank.com/api/parse/files/7uPtEVa0plEFoNExiYHcbtL1rQnpIGnnPHVuvKKu/cabdf9f3f379e5b839300f89a90ab27e_car.png', 'https://mashinbank.com/api/parse/files/7uPtEVa0plEFoNExiYHcbtL1rQnpIGnnPHVuvKKu/e1c6c75dda980a6b4b4a83932ed49832_car.png', 'https://mashinbank.com/api/parse/files/7uPtEVa0plEFoNExiYHcbtL1rQnpIGnnPHVuvKKu/81ef7c57ca349485a9ba78bf0e42e13f_car.png', 'https://mashinbank.com/api/parse/files/7uPtEVa0plEFoNExiYHcbtL1rQnpIGnnPHVuvKKu/02bd13f2c5ce936ec3db10706c03854d_car.png']

如何在Python中获取div标签中的样式值？

How to get the style value in a div tag in Python?

html

python

beautifulsoup

web-crawler

selenium-chromedriver