What does urllib.request.urlretrieve do if its return value is not used?

Question

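The Python documentation says that urllib.request.urlretrieve returns a tuple, which is then used to open the file, as shown in Code A below.

However, in the sample Code B, the return value of urllib.request.urlretrieve is not used, yet the code fails without that call. Please help explain what urllib.request.urlretrieve does in Code B. Thanks.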

Code A

>>> import urllib.request
>>> local_filename, headers = urllib.request.urlretrieve('http://python.org/')
>>> html = open(local_filename)
>>> html.close()

Code B

import os
import tarfile
from six.moves import urllib

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/"
HOUSING_PATH = os.path.join("datasets", "housing") # datasets\housing
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    if not os.path.isdir(housing_path):
        os.makedirs(housing_path)
    tgz_path = os.path.join(housing_path, "housing.tgz") #datasets\housing\housing.tgz
    urllib.request.urlretrieve(housing_url, tgz_path) #what does this code here do?
    housing_tgz = tarfile.open(tgz_path)
    housing_tgz.extractall(path=housing_path)
    housing_tgz.close()

Answer 1

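urlretrieve() saves the content found at a URL (for example a CSV file, an image, and so on) to a local file. In your case, it saves the housing data located at the URL. You can see the documentation [here][1].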

    # tgz_path is the local file path where the download will be saved
    tgz_path = os.path.join(housing_path, "housing.tgz")

    # takes 2 parameters: the URL and the file path to save the content to
    urllib.request.urlretrieve(housing_url, tgz_path)


  [1]: https://docs.python.org/3/library/urllib.request.html
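To make the side effect visible, here is a minimal sketch that calls urlretrieve and throws away the return value, then checks that the file was still written. The helper name download_ignoring_return, the temporary directory, and the file:// URL standing in for a remote server are all assumptions made for illustration, so that the example runs without network access.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def download_ignoring_return(url, dest_path):
    """Hypothetical helper: download url to dest_path, discarding the return value."""
    # The returned (filename, headers) tuple is dropped here on purpose.
    # The useful work is the side effect: the content is written to dest_path.
    urllib.request.urlretrieve(url, dest_path)


# Stand-in for a remote resource: a local file addressed by a file:// URL.
workdir = tempfile.mkdtemp()
source = Path(workdir) / "source.txt"
source.write_bytes(b"housing data placeholder")
dest = os.path.join(workdir, "copy.txt")

download_ignoring_return(source.as_uri(), dest)

# The file exists even though nothing was assigned from urlretrieve.
print(os.path.exists(dest))
```

This is why removing the urlretrieve line from Code B breaks it: without the download, tarfile.open(tgz_path) has no file to open.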

Answer 2

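In the second code sample, specifying filename makes urlretrieve save the content automatically to that local path. In this case, that is tgz_path.

I am not sure what you mean by failing. A tuple is always returned; the question is only whether you store it in a variable. For example, the following still works: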

In [1]: import urllib.request

In [2]: urllib.request.urlretrieve('http://python.org/', 'test.python')
Out[2]: ('test.python', <http.client.HTTPMessage at 0x108d22390>)
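As a rough illustration of how Code A and Code B relate, the sketch below captures the returned tuple and uses its first element to open the archive, which should be the same path that was passed in as filename. The function name fetch_and_extract, the archive_name parameter, and the locally built test archive are assumptions for this example and are not part of the original code.

```python
import os
import tarfile
import tempfile
import urllib.request
from pathlib import Path


def fetch_and_extract(url, target_dir, archive_name="archive.tgz"):
    """Hypothetical variant of Code B that keeps the tuple returned by urlretrieve."""
    os.makedirs(target_dir, exist_ok=True)
    archive_path = os.path.join(target_dir, archive_name)
    # saved_path is the filename we passed in; headers describes the response.
    saved_path, headers = urllib.request.urlretrieve(url, archive_path)
    with tarfile.open(saved_path) as archive:
        archive.extractall(path=target_dir)
    return saved_path


# Build a small local archive so the sketch runs without network access.
workdir = tempfile.mkdtemp()
member = Path(workdir) / "housing.csv"
member.write_text("longitude,latitude\n")
source_archive = Path(workdir) / "source.tgz"
with tarfile.open(str(source_archive), "w:gz") as archive:
    archive.add(str(member), arcname="housing.csv")

target_dir = os.path.join(workdir, "datasets", "housing")
saved_path = fetch_and_extract(source_archive.as_uri(), target_dir)

# The returned filename matches the path that was passed to urlretrieve.
print(saved_path == os.path.join(target_dir, "archive.tgz"))
```

Whether to capture the tuple is a matter of taste here: Code B already knows tgz_path, so it can ignore the return value and open that path directly.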
