Unable to grab div class with BeautifulSoup
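I'm trying to grab a div by its class, but for some reason I can't. There is an id, but it is different for every product. How can I successfully scrape <div id="feli1062" class="row mqs-prop-inner-wrap with featt">?
Here is my code: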
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
my_url = "https://meqasa.com/apartments-for-sale-in-Accra"
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")
containers = page_soup.findAll("div", {"class": "row mqs-prop-inner-wrap with featt"})
print(len(containers))
You can use the CSS selector div[id^="feli"]. This selects all <div> tags whose id= starts with "feli".
For example:
import requests
from bs4 import BeautifulSoup
url = 'https://meqasa.com/apartments-for-sale-in-Accra'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
for item in soup.select('div[id^="feli"]'):
    print(item.h2.get_text(strip=True))
    print('https://meqasa.com' + item.a['href'])
    print('-' * 80)
Prints:
3 bedroom apartment for sale at Community 25, Tema, Tema, Greater Accra Region
https://meqasa.com/3-bedroom-apartment-for-sale-in-Community 25, Tema, Tema, Greater Accra Region, Ghana-unit-1411
--------------------------------------------------------------------------------
2 bedroom apartment for sale at Sakumono
https://meqasa.com/2-bedroom-apartment-for-sale-in-Sakumono-unit-1385
--------------------------------------------------------------------------------
1 bedroom apartment for sale at East Legon
https://meqasa.com/1-bedroom-apartment-for-sale-in-East Legon-unit-1383
--------------------------------------------------------------------------------
2 bedroom apartment for sale at Accra
https://meqasa.com/2-bedroom-apartment-for-sale-in-Accra-unit-1408
--------------------------------------------------------------------------------
1 bedroom apartment for sale at Sakumono
https://meqasa.com/1-bedroom-apartment-for-sale-in-Sakumono-unit-1363
--------------------------------------------------------------------------------
... and so on.
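To try the selector without hitting the site, here is a minimal offline sketch. The HTML snippet is made up for illustration (only the first div's id and class come from the question; the second listing, the hrefs and the "other" div are assumptions), so it may not match the real page. It compares the id-prefix selector with a few other lookups, including the exact multi-class string used in the question, which only matches when the class attribute is exactly that string.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the listing structure from the question.
html = """
<div id="feli1062" class="row mqs-prop-inner-wrap with featt">
  <h2>3 bedroom apartment</h2><a href="/unit-1411">view</a>
</div>
<div id="feli1063" class="row mqs-prop-inner-wrap">
  <h2>2 bedroom apartment</h2><a href="/unit-1385">view</a>
</div>
<div id="other" class="row">not a listing</div>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS attribute selector: every <div> whose id starts with "feli".
by_prefix = soup.select('div[id^="feli"]')

# The same idea with find_all and a regular expression on the id.
by_regex = soup.find_all("div", id=re.compile(r"^feli"))

# A multi-class string matches only divs whose class attribute is exactly
# this string, so the second listing (different classes) is missed.
by_exact = soup.find_all("div", {"class": "row mqs-prop-inner-wrap with featt"})

# Matching on a single class finds both listings regardless of extra classes.
by_class = soup.select("div.mqs-prop-inner-wrap")

titles = [item.h2.get_text(strip=True) for item in by_prefix]
print(len(by_prefix), len(by_regex), len(by_exact), len(by_class))
print(titles)
```

If the listings on the real page do not all carry the same class string, that would explain why the exact-class lookup returns fewer containers than expected, while the id prefix or a single shared class still catches them all.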