Is there an alternative to using a backslash (\) inside an f-string in Python?
I am scraping this website: https://www.americanexpress.com/in/credit-cards/payback-card/
using Beautiful Soup and Python.
from urllib.request import urlopen
from bs4 import BeautifulSoup

link = 'https://www.americanexpress.com/in/credit-cards/payback-card/'
html = urlopen(link)
soup = BeautifulSoup(html, 'lxml')
details = []
# Pair each subtitle span with the text of the span that follows it.
for span in soup.select(".why-amex__subtitle span"):
    details.append(f'{span.get_text(strip=True)}: {span.find_next("span").get_text(strip=True)}')
print(details)
Output:
['EARN POINTS: Earn multiple Points from more than 50 PAYBACK partners2and 2 PAYBACK Points from American\xa0Express PAYBACK Credit\xa0Card for every Rs.\xa0100 spent', 'WELCOME GIFT: Get Flipkart voucher worth Rs. 7503on taking 3 transactions within 60 days of Cardmembership', 'MILESTONE BENEFITS: Flipkart vouchers4worth Rs. 7,000 on spending Rs. 2.5 lacs in a Cardmembership yearYou will earn a Flipkart voucher4worth Rs. 2,000 on spending Rs. 1.25 lacs in a Cardmembership year. Additionally, you will earn a Flipkart voucher4worth Rs. 5,000 on spending Rs. 2.5 lacs in a Cardmembership year.']
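As you can see in the output, there are \xa0 characters that need to be removed from the strings.
I tried using the replace function, but it does not work inside the f-string because a backslash (\) is involved.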
details.append(f'{span.get_text(strip=True)}: {span.find_next("span").get_text(strip=True).replace("\xa0","")}')
Is there any other way to solve this?
Any help is greatly appreciated!
This may serve as a workaround: since .replace("\xa0","")
does not work inside the f-string, make the change outside it first:
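from urllib.request import urlopen
from bs4 import BeautifulSoup

link = 'https://www.americanexpress.com/in/credit-cards/payback-card/'
html = urlopen(link)
soup = BeautifulSoup(html, 'lxml')
details = []
for span in soup.select(".why-amex__subtitle span"):
    # Strip the non-breaking spaces before the values reach the f-string.
    # Note: replacing with "" joins the neighbouring words; use " " to keep them apart.
    element = span.get_text(strip=True).replace("\xa0","")
    next_element = span.find_next("span").get_text(strip=True).replace("\xa0","")
    details.append(f'{element}: {next_element}')
print(details)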
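A variation on the same idea, as a minimal sketch: the restriction only concerns a backslash written inside the braces of the f-string, so the character can be bound to a name beforehand and the name used in the expression. (As far as I know, Python 3.12 lifted this restriction with PEP 701, so the original line may simply work there.) The constant name NBSP is a hypothetical choice, and the sample text is copied from the output in the question so that nothing has to be fetched.

```python
# Hypothetical constant name; "\xa0" is the non-breaking space (U+00A0).
NBSP = "\xa0"

# Sample strings taken from the output shown in the question (no network needed).
title = "EARN POINTS"
body = "American\xa0Express PAYBACK Credit\xa0Card for every Rs.\xa0100 spent"

# No backslash appears inside the braces, so the f-string is accepted.
# Replacing with a regular space keeps the words separated.
line = f'{title}: {body.replace(NBSP, " ")}'
print(line)
```

Replacing with " " rather than "" is a deliberate choice here: removing the character outright would fuse "American" and "Express" into one word.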
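You can use unicodedata to get rid of the \xa0 characters: NFKD normalization converts each non-breaking space into an ordinary space. It won't run when placed inside the f-string, but this will do it: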
from urllib.request import urlopen
from unicodedata import normalize
from bs4 import BeautifulSoup

link = 'https://www.americanexpress.com/in/credit-cards/payback-card/'
html = urlopen(link)
soup = BeautifulSoup(html, 'lxml')
details = []
for span in soup.select(".why-amex__subtitle span"):
    # Normalize outside the f-string, then interpolate the cleaned values.
    a = normalize('NFKD', span.get_text(strip=True))
    b = normalize('NFKD', span.find_next("span").get_text(strip=True))
    details.append(f'{a}: {b}')
print(details)
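To see what the normalization step does on its own, here is a small sketch that runs without fetching the page. The sample string is copied from the output in the question; the variable names are only illustrative.

```python
from unicodedata import normalize

# Sample text from the question's output, containing non-breaking spaces.
sample = "American\xa0Express PAYBACK Credit\xa0Card for every Rs.\xa0100 spent"

# NFKD applies compatibility decomposition: U+00A0 becomes a plain space.
cleaned = normalize("NFKD", sample)
print(cleaned)

# NFKD is broader than a single replace: it also rewrites other
# compatibility characters, for example the "fi" ligature (U+FB01).
ligature = normalize("NFKD", "\ufb01")
print(ligature)
```

Because NFKD touches more than non-breaking spaces (ligatures are split, accented letters are decomposed into a base letter plus a combining mark), a plain .replace("\xa0", " ") may be the safer choice if the rest of the text has to stay byte-for-byte unchanged.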