How to un-shorten (resolve) a url using python, when final url is https?

Question

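I'm looking to un-shorten (resolve) a url in python when the final url is https. I've seen the question "How can I un-shorten a URL using python?" (and other similar ones), but as noted in the comments on the accepted answer, that solution only works when the urls do not redirect to https.

For reference, the code from that question (which works fine when redirecting to http urls) is: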

# This is for Py2k.  For Py3k, use http.client and urllib.parse instead, and
# use // instead of / for the division
import httplib
import urlparse

def unshorten_url(url):
    parsed = urlparse.urlparse(url)
    h = httplib.HTTPConnection(parsed.netloc)
    resource = parsed.path
    if parsed.query != "":
        resource += "?" + parsed.query
    h.request('HEAD', resource)
    response = h.getresponse()
    if response.status/100 == 3 and response.getheader('Location'):
        return unshorten_url(response.getheader('Location'))  # recurse to process chains of short urls
    else:
        return url

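(Note: for obvious bandwidth reasons, I'd like to achieve this by requesting only the headers [i.e. like the http-only version above] rather than the content of the whole page.)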

Answer 1

You can get the scheme from the url and then use HTTPSConnection if parsed.scheme is https.
You can also use the requests library to do this very simply.

>>> import requests
>>> r = requests.head('http://bit.ly/IFHzvO', allow_redirects=True)
>>> print(r.url)
https://www.google.com
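
For the first suggestion (picking the connection class from parsed.scheme), a minimal Python 3 sketch might look like the following. It is not the code from the linked question: the helper name connection_for, the max_redirects limit and the 10-second timeout are assumptions added for illustration, and relative Location headers are resolved with urljoin.

```python
import http.client
from urllib.parse import urlparse, urljoin


def connection_for(parsed, timeout=10):
    """Return an HTTP or HTTPS connection depending on the URL scheme.

    Hypothetical helper; the timeout value is an assumption.
    """
    if parsed.scheme == "https":
        return http.client.HTTPSConnection(parsed.netloc, timeout=timeout)
    return http.client.HTTPConnection(parsed.netloc, timeout=timeout)


def unshorten_url(url, max_redirects=10):
    """Follow redirects using HEAD requests only and return the last URL seen.

    max_redirects is an assumed safety limit against redirect loops.
    """
    for _ in range(max_redirects):
        parsed = urlparse(url)
        conn = connection_for(parsed)
        resource = parsed.path or "/"
        if parsed.query:
            resource += "?" + parsed.query
        try:
            # HEAD asks for the headers only, so no page body is downloaded.
            conn.request("HEAD", resource)
            response = conn.getresponse()
            status = response.status
            location = response.getheader("Location")
        finally:
            conn.close()
        if status // 100 == 3 and location:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
        else:
            return url
    return url
```

Calling unshorten_url('http://bit.ly/IFHzvO') would then follow the chain through both http and https hops, provided the servers involved answer HEAD requests.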

python

url-shortener

python-2.7