Python urllib.request Throws 403 in Azure Notebooks

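I'm trying to fetch the TensorFlow Object Detection API models from within Azure Notebooks, but everything I try returns 403 Forbidden. Retrieving the same file locally or on AWS works without any problem.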

import six.moves.urllib as urllib
url = 'http://download.tensorflow.org/models/object_detection/rfcn_resnet101_coco_11_06_2017.tar.gz'
opener = urllib.request.URLopener()
opener.retrieve(url)

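I tried adding User-Agent data to the headers and so on, but that failed as well. I also tried wget, which returned 403 too. I believe the notebooks run inside a Docker container, so that may be part of the problem. Any insights or workarounds would be appreciated.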

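Azure Notebooks intentionally restricts access to external URLs. This is most likely to prevent people from using the Notebooks service to carry out denial-of-service attacks against other sites.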

https://blogs.technet.microsoft.com/machinelearning/2016/03/30/jupyter-notebooks-with-r-in-azure-ml-studio-2/

Access to external internet sites is restricted. However, we have white listed a number of important URLs:

  • All CRAN mirrors are on the white list, so you should be able to install packages using your favorite CRAN mirror.
  • Github is also white listed, meaning you can use devtools::install_github() to install packages that are not on CRAN, or get the development version of a package.
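
To check whether a given URL is affected from inside a notebook, something like the following may help. It is only a minimal sketch using Python 3's standard urllib: the function name probe_url and its opener parameter are made up for illustration, and the hint attached to a 403 is an assumption, since a 403 can also come from the remote server itself.

```python
import urllib.error
import urllib.request


def probe_url(url, opener=urllib.request.urlopen, timeout=10):
    """Return (ok, detail) describing whether `url` can be opened from here.

    `probe_url` is a hypothetical helper, not part of any library.
    `opener` is injectable so the function can be tried without network access.
    """
    try:
        # Open the response but do not read the body; status is enough here.
        with opener(url, timeout=timeout) as response:
            return True, "HTTP {}".format(response.status)
    except urllib.error.HTTPError as err:
        # HTTPError subclasses URLError, so it has to be caught first.
        if err.code == 403:
            # Assumption: in a restricted environment this may be the
            # outbound filter rather than the remote server.
            return False, "403 Forbidden (possibly blocked by an outbound filter)"
        return False, "HTTP {}".format(err.code)
    except urllib.error.URLError as err:
        # DNS failures, refused connections, timeouts and similar.
        return False, "connection failed: {}".format(err.reason)
```

Calling probe_url with the model URL from the question and again with a URL on a whitelisted host such as GitHub should show whether the block is specific to the destination. If only non-whitelisted hosts fail, the restriction described above is the likely cause, and changing headers or switching to wget will not help.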