Find the final redirected URL of a tiny URL
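I have followed several other questions on SO about finding the final redirected URL, but for the following URL I cannot get the redirect to work. It does not redirect and stays at tinyurl.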
import urllib2

def getFinalUrl(start_url):
    var = urllib2.urlopen(start_url)
    final_url = var.geturl()
    return final_url

url = "http://redirect.tinyurl.com/api/click?key=a7e37b5f6ff1de9cb410158b1013e54a&out=http%3A%2F%2Fwww.amazon.com%2Fgp%2Fprofile%2FA3B4EO22KUPKYW&loc=&cuid=0072ce987ebb47328d22e465a051ce7&opt=false&format=txt"
redirect = getFinalUrl(url)
print "redirect: " + redirect
Result (this is not the final URL you reach if you try it in a browser):
redirect: http://redirect.tinyurl.com/api/click?key=a7e37b5f6ff1de9cb410158b1013e54a&out=http%3A%2F%2Fwww.amazon.com%2Fgp%2Fprofile%2FA3B4EO22KUPKYW&loc=&cuid=0072ce987ebb47328d22e465a051ce7&opt=false&format=txt
import urlparse

url = 'http://redirect.tinyurl.com/api/click?key=a7e37b5f6ff1de9cb410158b1013e54a&out=http%3A%2F%2Fwww.amazon.com%2Fgp%2Fprofile%2FA3B4EO22KUPKYW&loc=&cuid=0072ce987ebb47328d22e465a051ce7&opt=false&format=txt'
try:
    # The destination is the percent-encoded 'out' query parameter
    out = urlparse.parse_qs(urlparse.urlparse(url).query)['out'][0]
    print(out)  # http://www.amazon.com/gp/profile/A3B4EO22KUPKYW
except KeyError:  # no 'out' parameter; don't catch everything
    print('not found')
This kind of URL does not need to be curled to find out what the destination/redirect URL is, because the destination is already contained in the URL itself.
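If the destination/redirect URL is not visible, as with this one:
tinyurl.com/xxxx
then it is a different story: you have to curl it (actually request it) to find out what it redirects to (a 3xx response), like the following: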
import requests
url = 'http://urlshortener.com/applebanana'
t = requests.get(url)
print(t.url)
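The two cases above can be combined into one helper. This is only a minimal sketch, and it assumes Python 3 and the standard library (`urllib.parse`, `urllib.request`) rather than `urllib2` or `requests`. The names `embedded_destination`, `final_url` and the `param` argument are made up for illustration. It also assumes the destination, when embedded, lives in a query parameter called `out`, as in the URL from the question.

```python
import urllib.request
from urllib.parse import urlparse, parse_qs


def embedded_destination(url, param="out"):
    """Return the destination embedded in the query string, or None.

    parse_qs percent-decodes the value, so the result is a plain URL.
    """
    values = parse_qs(urlparse(url).query).get(param)
    return values[0] if values else None


def final_url(url, param="out", timeout=10):
    """Prefer the embedded destination; otherwise follow HTTP redirects."""
    embedded = embedded_destination(url, param)
    if embedded is not None:
        return embedded  # no network request needed
    # urlopen follows 3xx redirects; geturl() reports where it ended up.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.geturl()
```

For the URL in the question, `final_url(url)` returns the `out` value without making a request. For something like `tinyurl.com/xxxx` it falls through to the network call, which only follows HTTP-level redirects and will not see a redirect done in JavaScript or in the response body.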