How to extract an attribute inside a class with BeautifulSoup?
I have spent the whole day trying to get the values of 'data-tippy-content' from the following:
<span class="opacity-70">
<span class="icon-small mr-1" data-tippy-content="Niacinamide"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/niacinamide-def098d8bdadc1b0e2f848d8e77e6ca5d744099553ff816b09d4f3521abe0bc0.png"></span>
<span class="icon-small mr-1" data-tippy-content="Good for Oily Skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_oil-3587634ebcb8ef7c01c9959a4cbcbc2d85f2990a9216d24ec1081f1a931c4264.png"></span>
<span class="icon-small mr-1" data-tippy-content="Helps reduce Skin Redness"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_redness-8684f0660c6620202a754fd9bd20a0378f89d43fe2cb0fdc19a4b26af1f0b9b6.png"></span><span class="icon-small mr-1" data-tippy-content="Helps fight Acne"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_acne-82d0b5bc5978478dd67def5e721767e35db9523db64e7756343c95bfb99f842f.png"></span>
<span class="icon-small mr-1" data-tippy-content="Helps brighten skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_brightening-ea550c7d84d6517a76f19bab645a747913a49fe0a565f177cfa1b73f20a89d21.png"></span>
</span>
I have already tried a lot of things, but still no luck.
I have the following code (the problem is in 'effect'):
items = soup.findAll('div', 'relative flex flex-col min-w-0 rounded-2xl break-words border-2 border-blue-gray-100 h-full lg:h-72')
ef_list = []
for i in items:
    name = i.find('h4', 'mb-1 leading-tight text-blue-gray-700 group-hover:underline font-header text-xl font-bold hover:underline whitespace-normal overflow-hidden').text.strip()
    function = i.find('p', 'mb-0 text-sm').text.strip()
    effect = i.find('span', 'opacity-70').find('span')['data-tippy-content']
    print(effect)
    data.append([name, function, effect])
I expect the output to be something like:
"Niacinamide, Good for Oily Skin, Helps reduce Skin Redness, Helps brighten skin"
But instead I get:
Traceback (most recent call last):
  File "d:\Scrape\scrape.py", line 36, in <module>
    print(effect['data-tippy-content'])
  File "C:\Users\AppData\Local\Programs\Python\Python310\lib\site-packages\bs4\element.py", line 1519, in __getitem__
    return self.attrs[key]
KeyError: 'data-tippy-content'
I don't know how to get those 'data-tippy-content' values (I'm a beginner).
Here is the inspected CSS: [image: HTML]
The attribute whose value you want to extract is data-tippy-content, so you can simply call get('data-tippy-content') on each matching tag to get the desired data.
html = '''
<span class="opacity-70">
<span class="icon-small mr-1" data-tippy-content="Niacinamide"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/niacinamide-def098d8bdadc1b0e2f848d8e77e6ca5d744099553ff816b09d4f3521abe0bc0.png"></span>
<span class="icon-small mr-1" data-tippy-content="Good for Oily Skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_oil-3587634ebcb8ef7c01c9959a4cbcbc2d85f2990a9216d24ec1081f1a931c4264.png"></span>
<span class="icon-small mr-1" data-tippy-content="Helps reduce Skin Redness"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_redness-8684f0660c6620202a754fd9bd20a0378f89d43fe2cb0fdc19a4b26af1f0b9b6.png"></span><span class="icon-small mr-1" data-tippy-content="Helps fight Acne"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_acne-82d0b5bc5978478dd67def5e721767e35db9523db64e7756343c95bfb99f842f.png"></span>
<span class="icon-small mr-1" data-tippy-content="Helps brighten skin"><img loading="lazy" src="https://skinsort.com/assets/icons/thumbnail/best_brightening-ea550c7d84d6517a76f19bab645a747913a49fe0a565f177cfa1b73f20a89d21.png"></span>
</span>
'''
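from bs4 import BeautifulSoup

soup = BeautifulSoup(html, 'html.parser')
# Select every <span> nested inside the element with class "opacity-70"
# and read its data-tippy-content attribute (None if the attribute is absent).
txt = [x.get('data-tippy-content') for x in soup.select('.opacity-70 span')]
print(txt)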
Output:
['Niacinamide', 'Good for Oily Skin', 'Helps reduce Skin Redness', 'Helps fight Acne', 'Helps brighten skin']
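To get the comma-separated string the question asks for, one value per product card, the same selector can be applied inside the loop and the results joined. The following is only a minimal sketch: the `card` class and the shortened markup are hypothetical stand-ins for the real page, and the attribute selector `span[data-tippy-content]` is used here (as one possible choice) so that spans without the attribute are skipped rather than returned as None.

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened stand-in for one product card on the page.
html = '''
<div class="card">
  <h4>Sample Product</h4>
  <p class="mb-0 text-sm">Moisturizer</p>
  <span class="opacity-70">
    <span class="icon-small mr-1" data-tippy-content="Niacinamide"><img src="a.png"></span>
    <span class="icon-small mr-1" data-tippy-content="Good for Oily Skin"><img src="b.png"></span>
    <span class="icon-small mr-1"><img src="c.png"></span>
  </span>
</div>
'''

soup = BeautifulSoup(html, 'html.parser')

data = []
for card in soup.select('div.card'):
    name = card.find('h4').text.strip()
    function = card.find('p').text.strip()
    # Keep only the inner spans that actually carry the attribute.
    spans = card.select('.opacity-70 span[data-tippy-content]')
    # Join all attribute values of this card into one string.
    effect = ', '.join(s['data-tippy-content'] for s in spans)
    data.append([name, function, effect])

print(data)
```

With this sample markup, `data` holds a single row whose last element is 'Niacinamide, Good for Oily Skin'. On the real page, the long class strings from the question would replace the hypothetical `div.card` selector.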