How to get data from a span tag that has custom attributes? (BeautifulSoup)
I have the following span tag. How can I scrape the value xuRMlBoIUcI7nAJktBcJvPByp1DLE4aPGzq3JNiRKsdNqUkVSJBY%2BggxRhp0GcRx4Gw4lWQxbTk%3D
that is assigned to data-slug?
<span data-ju-jspjrvxy=""
data-slug="xuRMlBoIUcI7nAJktBcJvPByp1DLE4aPGzq3JNiRKsdNqUkVSJBY%2BggxRhp0GcRx4Gw4lWQxbTk%3D"
data-gtm-clickedelement="CTA button" data-gtm-offer="" data-ju-wvxjoly-pk="303795"
data-gtm-voucher-id="303795" class="businessinsiderus-voucher-button-holder clear">
If s is your data string, you can use the regular expression module:
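import re
# Capture everything between the quotes of the data-slug attribute.
# (The original pattern had an empty group, so it could never match a real value.)
match = re.findall(r'data-slug="([^"]*)"', str(s))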
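If I understand your question correctly, you want to scrape an attribute of the tag.
If that is indeed your question, the following link provides a solution:
Extracting an attribute value with beautifulsoup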
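from bs4 import BeautifulSoup as BS
# Replace this placeholder with the HTML that contains the span.
content = 'your html span text here'
# 'lxml' needs the lxml package installed; 'html.parser' is the built-in alternative.
soup = BS(content, features='lxml')
# .attrs is a dict mapping each attribute name to its value.
dict_of_spantag_attributes_and_values = soup.span.attrs
for i, j in dict_of_spantag_attributes_and_values.items():
    print(f'{i}:{j}')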
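Since the question asks only for data-slug, here is a minimal sketch that reads that single attribute directly instead of listing all of them. It assumes the span from the question (closed here with `</span>` so the snippet is complete) and uses the built-in `html.parser`. The variable names `html`, `span`, `slug`, and `decoded_slug` are illustrative, and the `unquote` step is an optional extra that is not part of the original answers.

```python
from urllib.parse import unquote

from bs4 import BeautifulSoup

# The span from the question, closed here so the snippet is complete HTML.
html = (
    '<span data-ju-jspjrvxy="" '
    'data-slug="xuRMlBoIUcI7nAJktBcJvPByp1DLE4aPGzq3JNiRKsdNqUkVSJBY%2BggxRhp0GcRx4Gw4lWQxbTk%3D" '
    'data-gtm-clickedelement="CTA button" data-gtm-offer="" data-ju-wvxjoly-pk="303795" '
    'data-gtm-voucher-id="303795" class="businessinsiderus-voucher-button-holder clear"></span>'
)

soup = BeautifulSoup(html, "html.parser")

# attrs={"data-slug": True} matches the first span that carries the attribute,
# whatever its value is.
span = soup.find("span", attrs={"data-slug": True})

# Tag.get returns None instead of raising KeyError if the attribute is missing.
slug = span.get("data-slug") if span is not None else None
print(slug)

# Optional (an addition, not from the answers): undo the percent-encoding,
# so %2B becomes + and %3D becomes =.
decoded_slug = unquote(slug) if slug is not None else None
print(decoded_slug)
```

If the page has several such spans, `soup.find_all("span", attrs={"data-slug": True})` should return all of them, and the same `.get("data-slug")` call applies to each.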