Extract part of the URL using Python 3.x

Question

I am trying to extract the ICID part from my URL:

url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"

So what I am really after is:

ICID=secondary_pricing_goldplus_cust_paymentpage_anon

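I am trying the following, but it obviously does not work (any help would be much appreciated). The ICID part can be anywhere in the URL: at the beginning, in the middle, or at the end: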

from urllib.parse import urlparse
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
obj = urlparse(url)
print(obj)
query = obj.query
print (query)
path_list = query.split("ICID(.+?)&")
print (path_list)

Answer 1

I would solve this by splitting the URL string at the "&" character:

url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"

split_list=url.split("&")

# iterate through the list and find the piece that contains the "ICID" string
for i in split_list:
    if "ICID" in i:
        # i will be your final string
        print(i)
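
One caveat with splitting the whole URL: if ICID happens to be the first parameter, the matching piece still carries everything before the "?", and a substring test also matches keys that merely contain "ICID". A minimal sketch that splits only the query string and compares the key exactly is shown below. The helper name `find_param` is hypothetical, and the URL is a shortened, reordered variant of the one in the question, used purely for illustration.

```python
from urllib.parse import urlsplit

# Shortened variant of the question's URL with ICID moved to the front
# (an assumption made for illustration only).
url = ("https://secure.melrosed.co.uk/customer/secure/checkout/"
       "?ICID=secondary_pricing_goldplus_cust_paymentpage_anon"
       "&productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&campaignId=099A")


def find_param(url, name):
    """Return the raw 'name=value' piece for the first parameter called `name`, or None."""
    # Work on the query string only, so the scheme, host and path never leak in.
    query = urlsplit(url).query
    for pair in query.split("&"):
        # partition() splits at the first "=" only, so values containing "=" survive.
        key, _, _ = pair.partition("=")
        if key == name:
            return pair
    return None


print(find_param(url, "ICID"))
```

Note that this returns the piece exactly as it appears in the URL, without percent-decoding.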

Answer 2

You are halfway there: using urlparse is the right first step, but you then want to use parse_qs (also from urllib.parse) to parse the query:

from urllib.parse import urlparse, parse_qs
url = "https://secure.melrosed.co.uk/customer/secure/checkout/?productId=nbrdk4rzgj6tuhtduzhgobuobvwgytm&offerId=freetrial-digitalgold-month-SS5011&campaignId=099A&ICID=secondary_pricing_goldplus_cust_paymentpage_anon&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2Fcosmetics%2Ffeatures%2Fgreat-product-search-lotion-steve-treats-famous-product%2F"
query = urlparse(url).query
path_list = parse_qs(query)['ICID']

Output:

>>> print(path_list)
['secondary_pricing_goldplus_cust_paymentpage_anon']
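
Indexing the parse_qs result with ['ICID'] raises KeyError when the parameter is absent, and the result is a list rather than the "ICID=..." string the question asks for. A small sketch that handles both points follows. The helper name `get_param` is hypothetical, and the URL is shortened from the question's for readability.

```python
from urllib.parse import urlparse, parse_qs


def get_param(url, name, default=None):
    """Return the first decoded value of query parameter `name`, or `default` if absent."""
    params = parse_qs(urlparse(url).query)
    # parse_qs maps each key to a list of values; .get() avoids a KeyError.
    values = params.get(name)
    return values[0] if values else default


# Shortened version of the question's URL (illustration only).
url = ("https://secure.melrosed.co.uk/customer/secure/checkout/"
       "?campaignId=099A"
       "&ICID=secondary_pricing_goldplus_cust_paymentpage_anon"
       "&redirectTo=https%3A%2F%2Fwww.melrosed.co.uk%2F")

value = get_param(url, "ICID")
if value is not None:
    # Rebuild the "key=value" form requested in the question.
    print(f"ICID={value}")
```

parse_qs also percent-decodes values and, by default, drops parameters with blank values, so an empty "ICID=" would be treated here as missing.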

