Python / BeautifulSoup Image Scraping Does Not Save Animated GIFs Correctly

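I have a piece of Python code that scrapes some images from a website every morning, for a daily project I'm in charge of. Everything works, and I get JPGs and PNGs without any problem. The problem is that animated GIFs are saved/downloaded as static GIFs most of the time. Occasionally one does get saved as animated, but rarely.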

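I'm not very familiar with BeautifulSoup, so I'm not sure whether I'm doing something wrong or whether there is a limitation in the way BeautifulSoup handles animated GIFs.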

I'm using a Kickstarter URL just for testing purposes...

import requests
import urllib.request
from bs4 import BeautifulSoup

# Fetch the page and parse its HTML
baseUrl = requests.get('https://www.kickstarter.com/projects/peak-design/travel-tripod-by-peak-design')
soup = BeautifulSoup(baseUrl.text, 'html.parser')

allImgs = soup.find_all('img')

imgCounter = 1

for img in allImgs:
    # Only the src attribute is read here
    newImg = img.get('src')
    if newImg is None:
        continue  # <img> without a src: nothing to download

    # CHECK EXTENSION
    if '.jpg' in newImg:
        extension = '.jpg'
    elif '.png' in newImg:
        extension = '.png'
    elif '.gif' in newImg:
        extension = '.gif'
    else:
        continue  # skip anything that is not a JPG, PNG, or GIF

    # Save the raw bytes as 1.jpg, 2.png, 3.gif, ...
    with open(str(imgCounter) + extension, 'wb') as imgFile:
        imgFile.write(urllib.request.urlopen(newImg).read())
    imgCounter = imgCounter + 1

Any help or insight on this issue would be greatly appreciated!!!

-S

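Here is what worked for me... Basically, for any GIF file I needed to read data-src instead of src, which is what I had been using for all images. The modified code tries data-src first and falls back to src when data-src is missing.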
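To show that idea in isolation, here is a minimal sketch of the attribute choice on its own. The helper name `pick_image_url` and the example.com URLs are hypothetical, and the sketch assumes the page lazy-loads its images, so that `data-src` carries the real file while `src` may only hold a static placeholder. It accepts anything with a dict-style `.get`, which includes BeautifulSoup `Tag` objects.

```python
def pick_image_url(img):
    """Return the URL to download for an <img>-like object, or None.

    Prefers 'data-src' (assumed to hold the real, possibly animated file)
    and falls back to 'src'. An empty attribute is treated as missing.
    `img` only needs a dict-style .get(), as BeautifulSoup tags have.
    """
    return img.get('data-src') or img.get('src')


# Plain dicts stand in for BeautifulSoup tags in this sketch.
lazy_gif = {
    'src': 'https://example.com/still.gif',
    'data-src': 'https://example.com/animated.gif',
}
plain_jpg = {'src': 'https://example.com/photo.jpg'}

print(pick_image_url(lazy_gif))   # the data-src URL wins
print(pick_image_url(plain_jpg))  # falls back to src
print(pick_image_url({}))         # None: nothing to download
```

Inside the scraping loop, the same call could take each `img` tag directly, with a `None` result meaning the tag should be skipped.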

The modified code is as follows:

import requests
import urllib.request
from bs4 import BeautifulSoup

# Fetch the page and parse its HTML
baseUrl = requests.get('https://www.kickstarter.com/projects/peak-design/travel-tripod-by-peak-design')
soup = BeautifulSoup(baseUrl.text, 'html.parser')

allImgs = soup.find_all('img')

imgCounter = 1

for img in allImgs:
    # Prefer data-src (where this page keeps the animated GIFs); fall back to src
    newImg = img.get('data-src')
    if newImg is None:
        newImg = img.get('src')
    if newImg is None:
        continue  # <img> with neither attribute: nothing to download

    # CHECK EXTENSION
    if '.jpg' in newImg:
        extension = '.jpg'
    elif '.png' in newImg:
        extension = '.png'
    elif '.gif' in newImg:
        extension = '.gif'
    else:
        continue  # skip anything that is not a JPG, PNG, or GIF

    # Save the raw bytes as 1.jpg, 2.png, 3.gif, ...
    with open(str(imgCounter) + extension, 'wb') as imgFile:
        imgFile.write(urllib.request.urlopen(newImg).read())
    imgCounter = imgCounter + 1
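
A substring check such as `'.gif' in newImg` also matches when that text appears elsewhere in the URL, for example in a query string. As a hedged alternative, and not part of the code above, the sketch below reads the extension from the URL path only. The names `extension_from_url` and `ALLOWED_EXTENSIONS` and the example.com URLs are hypothetical, chosen for illustration.

```python
import os
from urllib.parse import urlparse

# The three image types the script above saves
ALLOWED_EXTENSIONS = {'.jpg', '.png', '.gif'}


def extension_from_url(url):
    """Return the lowercase extension of the URL's path, or None.

    Only the path is inspected, so query strings and fragments are ignored.
    None means the URL does not end in one of the allowed image types.
    """
    path = urlparse(url).path
    ext = os.path.splitext(path)[1].lower()
    return ext if ext in ALLOWED_EXTENSIONS else None


print(extension_from_url('https://example.com/img/anim.GIF?width=680'))  # .gif
print(extension_from_url('https://example.com/img/photo.jpg'))           # .jpg
print(extension_from_url('https://example.com/page?file=x.png'))         # None
```

A `None` result can be used to skip the image instead of guessing a file name for it.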