HTML parsing div.p.ol returns blank in Python
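Per the screenshot below, I'm trying to use the "ol" tag as the header tag so that I can run a for loop over all the "li" tags and get the content/addresses from "aria-label".
However, my code returns blank. Does anyone know how to get this "ol" to act as the header? Thank you so much!!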
import requests
from bs4 import BeautifulSoup
# website
sitemap = 'https://www.walmart.com/store/finder?location=87321&distance=100'
# content of website
sitemap_content = requests.get(sitemap).content
# parsing website
soup = BeautifulSoup(sitemap_content, 'html.parser')
#print(soup)
header_div = soup.div.ol.li
print(header_div)
[Screenshot of inspect element]
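The data you see on the page is stored as JSON inside a <script> element. requests does not execute JavaScript, so the rendered "ol" list is presumably never present in the HTML it downloads, which would explain why soup.div.ol.li comes back empty.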
import json
import requests
from bs4 import BeautifulSoup
url = 'https://www.walmart.com/store/finder?location=02468&distance=100'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
data = json.loads(soup.select_one('#storeFinder').string)
# uncomment this to print all data:
# print(json.dumps(data, indent=4))
# print some data to screen:
for store in data['storeFinder']['storeFinderCarousel']['stores']:
    print(store['displayName'])
    print(store['address']['address'])
    print(store['address']['postalCode'], store['address']['city'])
    print('-' * 80)
Prints:
Framingham Store
121 Worcester Rd
01701 Framingham
--------------------------------------------------------------------------------
Walpole Supercenter
550 Providence Hwy
02081 Walpole
--------------------------------------------------------------------------------
Quincy Store
301 Falls Blvd
02169 Quincy
--------------------------------------------------------------------------------
...and so on.
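If you want to check the idea offline before hitting the site, here is a minimal sketch that uses only the standard library's html.parser instead of BeautifulSoup. SAMPLE_HTML is a hypothetical, heavily simplified stand-in for the page source (the real page is much larger and may change), and ScriptJsonParser is a helper name made up for this illustration. It shows that the static source can contain no "ol" at all while the store data sits in the script element with id "storeFinder".

```python
import json
from html.parser import HTMLParser

# Hypothetical, simplified stand-in for the HTML that requests would download.
# The JSON layout mirrors the keys used in the answer above.
SAMPLE_HTML = """
<html><body>
<div id="app"></div>
<script id="storeFinder" type="application/json">
{"storeFinder": {"storeFinderCarousel": {"stores": [
  {"displayName": "Framingham Store",
   "address": {"address": "121 Worcester Rd",
               "postalCode": "01701",
               "city": "Framingham"}}
]}}}
</script>
</body></html>
"""


class ScriptJsonParser(HTMLParser):
    """Collect the text of the <script> with a given id and count <ol> tags."""

    def __init__(self, script_id):
        super().__init__()
        self.script_id = script_id
        self.in_target = False
        self.chunks = []
        self.ol_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "ol":
            self.ol_count += 1
        if tag == "script" and dict(attrs).get("id") == self.script_id:
            self.in_target = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_target = False

    def handle_data(self, data):
        # Script contents arrive here as raw text while in_target is set.
        if self.in_target:
            self.chunks.append(data)


parser = ScriptJsonParser("storeFinder")
parser.feed(SAMPLE_HTML)
parser.close()

data = json.loads("".join(parser.chunks))
stores = data["storeFinder"]["storeFinderCarousel"]["stores"]

print("ol tags in static HTML:", parser.ol_count)
for store in stores:
    address = store["address"]
    print(store["displayName"])
    print(address["address"])
    print(address["postalCode"], address["city"])
```

On the live page the same check applies: if the "ol" count is zero in the downloaded source, tag navigation such as soup.div.ol.li has nothing to find, and the JSON in the script element is the place to look.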