How to download an image from CNN Business using lxml

Question

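I have looked at a very similar Stack Overflow question, "Python download image with lxml", but its answer still does not work for my case.

I would like some help downloading the image from the CNN Business forecast page. This is my code so far: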

MWE

import lxml.html
import requests


ticker = "AAL"
ticker = ticker.upper()
url = f"https://money.cnn.com/quote/forecast/forecast.html?symb={ticker}"

xpath = '//*[@id="wsod_forecasts"]/div[1]/div/img'

response = requests.get(url)
parsed_page = lxml.html.fromstring(response.content)  # returns the root HtmlElement, not a list


# from: 
# this also fails
tree = lxml.html.parse(url)
img = tree.get_element_by_id('img')
img_url = img.attrib['src']

with open('image.jpg', 'wb') as outf:
    data = requests.get(img_url).content
    outf.write(data)

Question

How can I download the image?

Answer 1

Add this after the line that defines parsed_page:

img_url = "http:"+parsed_page.xpath('//*[@id="wsod_forecasts"]/div[1]/div/img')[0].attrib['src']

Or, with a shorter XPath:

img_url = "http:"+parsed_page.xpath('//*[@id="wsod_forecasts"]//img')[0].attrib['src']

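Then run your with open() block and the image should download. The "http:" prefix is needed because the src attribute on this page is protocol-relative (it starts with "//").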
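Putting the pieces together, here is a minimal sketch of the whole flow. The helper names extract_image_url and download_image, the timeout value, and the sample HTML fragment are assumptions made for illustration; they are not part of the original question or answer. It uses urljoin instead of prepending "http:" by hand, so a protocol-relative src inherits the scheme of the page URL.

```python
from urllib.parse import urljoin

import lxml.html
import requests

PAGE_URL = "https://money.cnn.com/quote/forecast/forecast.html?symb=AAL"
IMG_XPATH = '//*[@id="wsod_forecasts"]//img'


def extract_image_url(html, base_url=PAGE_URL):
    """Return the absolute URL of the first matching image, or None."""
    page = lxml.html.fromstring(html)
    matches = page.xpath(IMG_XPATH)  # xpath() returns a list of elements
    if not matches:
        return None
    src = matches[0].get("src")
    if not src:
        return None
    # urljoin resolves protocol-relative ("//host/path") and relative paths
    return urljoin(base_url, src)


def download_image(img_url, path="image.jpg"):
    """Fetch img_url and write the raw bytes to path (hypothetical helper)."""
    response = requests.get(img_url, timeout=10)
    response.raise_for_status()
    with open(path, "wb") as outf:
        outf.write(response.content)


# Offline check against a made-up fragment shaped like the target page
sample_html = (
    '<html><body><div id="wsod_forecasts"><div><div>'
    '<img src="//example.com/chart.png">'
    "</div></div></div></body></html>"
)
sample_url = extract_image_url(sample_html)
```

Against the live page, the call would look like download_image(extract_image_url(requests.get(PAGE_URL).content)), after checking that the extracted URL is not None. The second attempt in the question fails for a different reason: get_element_by_id('img') looks for an element whose id attribute is "img", whereas the XPath above selects by tag name.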
