Extract part of a URL using Python 3.x
I am trying to extract the ICID part from my URL:
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
So what I am really after is:
ICID=secondary_pricing_goldplus_cust_paymentpage_anon
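I am trying the following, but obviously it does not work (any help would be appreciated). The ICID part can be anywhere in the URL: at the beginning, in the middle, or at the end: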
from urllib.parse import urlparse
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
obj = urlparse(url)
print(obj)
query = obj.query
print (query)
path_list = query.split("ICID(.+?)&")
print (path_list)
I would solve this by splitting the URL string at the "&" symbol:
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
split_list = url.split("&")
# iterate through the list and get the piece that has the "ICID" string in it
for i in split_list:
    if "ICID" in i:
        # i will be your final string
        print(i)
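One caveat with the substring test above: if ICID is the first parameter, the matching piece still carries everything before the "?", and any other piece that merely contains the text "ICID" matches too. A minimal sketch of a stricter variant follows, assuming the parameter name is exactly ICID. The helper name find_param and the shortened example.com URL are made up for illustration and are not from the original question.

```python
def find_param(url, name):
    """Return 'name=value' for the first query parameter called name, or None."""
    # Everything after the first "?" is the query string; drop any "#fragment".
    _, _, query = url.partition("?")
    query = query.partition("#")[0]
    for pair in query.split("&"):
        # Compare the key only, so a value that contains "ICID" does not match.
        key, _, _ = pair.partition("=")
        if key == name:
            return pair
    return None


# Hypothetical URL: ICID comes first, and the encoded redirect also mentions ICID.
url = "https://example.com/checkout/?ICID=abc_123&redirectTo=https%3A%2F%2Fexample.com%2F%3FICID%3Dother"
print(find_param(url, "ICID"))  # ICID=abc_123
```

This keeps the value exactly as it appears in the URL, without percent-decoding it.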
You are halfway there. Using urlparse is the right first step, but then you want to use parse_qs (also from urllib.parse) to parse the query:
from urllib.parse import urlparse, parse_qs
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
query = urlparse(url).query
path_list = parse_qs(query)['ICID']
Output:
>>> print(path_list)
['secondary_pricing_goldplus_cust_paymentpage_anon']
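Note that parse_qs returns a list for each key, and indexing with ['ICID'] raises KeyError when the parameter is absent. If you want the "ICID=..." string from the question and a safe result for URLs without it, something like the sketch below may help. The helper name get_param and the shortened example.com URL are my own and only illustrative.

```python
from urllib.parse import urlparse, parse_qs


def get_param(url, name):
    """Return 'name=value' for the first occurrence of name, or None if absent."""
    params = parse_qs(urlparse(url).query)
    # .get avoids a KeyError when the parameter is not in the query string.
    values = params.get(name)
    if not values:
        return None
    return f"{name}={values[0]}"


# Hypothetical, shortened URL reusing the ICID value from the question.
url = "https://example.com/checkout/?campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon"
print(get_param(url, "ICID"))  # ICID=secondary_pricing_goldplus_cust_paymentpage_anon
```

Two things to keep in mind: parse_qs percent-decodes values, so the result may differ from the raw text in the URL, and by default it drops parameters with blank values, so a bare "ICID=" would come back as None here.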