Python WebScraper - object has no attribute 'urlretrieve'
I'm trying to create a Python web scraper that downloads a certain number of images from a URL into my current directory. But the following line:
urllib.request.urlretrieve(each, filename)
raises AttributeError: 'function' object has no attribute 'urlretrieve' when I run the program.
The full code is below:
from urllib.request import urlopen
from bs4 import BeautifulSoup as soup

url = 'https://unsplash.com/s/photos/download'

def download_imgs(url, amountOfImgs):
    html = urlopen(url).read()
    # parsing the html from the url
    page_soup = soup(html, "html.parser")
    images = [img for img in page_soup.findAll('img')]
    counter = 0
    # compiling the unicode list of image links
    image_links = [each.get('src') for each in images]
    for each in image_links:
        if(counter <= amountOfImgs):
            filename = each.split('/')[-1]
            urllib.request.urlretrieve(each, filename)
            counter += 1
        else:
            return image_links

print(download_imgs(url, 5))
It looks like when you only imported urlopen, you missed everything else.
I did it a bit differently: I got the HTML with the requests.get method, so there's no need to call urlopen. Alternatively, you could do
from urllib.request import urlopen, urlretrieve
If you want to use mine, I know it works:
import urllib.request
from bs4 import BeautifulSoup as soup
import requests

url = 'https://unsplash.com/s/photos/download'

def download_imgs(url, amountOfImgs):
    req = requests.get(url)
    html = req.text
    # parsing the html from the url
    page_soup = soup(html, "html.parser")
    images = [img for img in page_soup.findAll('img')]
    counter = 0
    # compiling the unicode list of image links
    image_links = [each.get('src') for each in images]
    for each in image_links:
        if(counter <= amountOfImgs):
            filename = each.split('/')[-1]
            urllib.request.urlretrieve(each, filename)
            counter += 1
        else:
            return image_links

print(download_imgs(url, 5))
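For reference, here is a minimal sketch of the smaller fix hinted at above: keep urlopen but import urlretrieve alongside it, so the call no longer needs the urllib.request prefix. This is an untested sketch based on the code above, not the answer's exact method; the loop is also tightened so it stops after exactly amountOfImgs downloads and skips img tags without a src attribute.

from urllib.request import urlopen, urlretrieve
from bs4 import BeautifulSoup as soup

url = 'https://unsplash.com/s/photos/download'

def download_imgs(url, amountOfImgs):
    # parse the page, same html.parser as above
    page_soup = soup(urlopen(url).read(), "html.parser")
    # skip <img> tags that have no src attribute, so split() never sees None
    image_links = [img.get('src') for img in page_soup.findAll('img') if img.get('src')]
    downloaded = 0
    for each in image_links:
        if downloaded >= amountOfImgs:
            break
        filename = each.split('/')[-1]
        # urlretrieve was imported directly, so no urllib.request prefix is needed
        urlretrieve(each, filename)
        downloaded += 1
    return image_links

print(download_imgs(url, 5))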