Retrieve MechanicalSoup results after submitting a form

Question

I'm struggling to retrieve the results from a simple form submission. Here is what I have so far:

import mechanicalsoup

browser = mechanicalsoup.StatefulBrowser()
browser.set_verbose(2)
url = "https://www.dermcoll.edu.au/find-a-derm/"
browser.open(url)

form = browser.select_form("#find-derm-form")
browser["postcode"] = 3000
browser.submit_selected()

form.print_summary()

Where do these results end up...?

Many thanks

Answer 1

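According to the MechanicalSoup FAQ, this library should not be used for dynamic forms that rely on JavaScript, which appears to be the case for the website in your example.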
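To answer the literal question for the static case, here is a minimal sketch. It assumes a recent MechanicalSoup release in which `StatefulBrowser` exposes a `page` property; the helper name `submit_and_read` and its parameters are hypothetical, introduced only for illustration. The point it shows is that the results are not stored on the form object: they come back as the return value of `submit_selected()`, and the browser's current page is updated to the parsed response.

```python
def submit_and_read(browser, url, form_selector, fields):
    """Open `url`, fill in the selected form, submit it, and return the results.

    `browser` is assumed to behave like mechanicalsoup.StatefulBrowser.
    Returns a (response, page) tuple: the HTTP response from the submission
    and the browser's current page after it.
    """
    browser.open(url)
    browser.select_form(form_selector)

    # Same item assignment as browser["postcode"] = 3000 in the question.
    for name, value in fields.items():
        browser[name] = value

    # The submission result is the return value, not something on the form.
    response = browser.submit_selected()

    # After submitting, the browser's current page is the page that came back.
    return response, browser.page
```

Called with `mechanicalsoup.StatefulBrowser()`, the URL from the question, `"#find-derm-form"` and `{"postcode": 3000}`, this returns the HTTP response and the parsed HTML the server sent back. On a page that fills in its results with JavaScript, that HTML will likely not contain the result cards, which is the limitation described above.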

Instead, you can use Selenium in combination with BeautifulSoup (and a little bit of help from webdriver-manager) to get the result you want. A short example follows:

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

# set up the Chrome driver instance using webdriver_manager (Selenium 4 style)
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

# navigate to the page
driver.get("https://www.dermcoll.edu.au/find-a-derm/")

# find the postcode input and enter your desired value
postcode_input = driver.find_element(By.NAME, "postcode")
postcode_input.send_keys("3000")

# find the search button (it carries two classes, hence the CSS selector) and perform the search
search_button = driver.find_element(By.CSS_SELECTOR, ".search-btn.location_derm_search_icon")
search_button.click()

# get all search results and load them into a BeautifulSoup object for parsing
results_html = driver.find_element(By.ID, "search_result").get_attribute("innerHTML")
soup = BeautifulSoup(results_html, "html.parser")

# the HTML has been copied out, so the browser can be closed
driver.quit()

# get individual result cards
cards = soup.find_all("div", {"class": "address_sec_contents"})

# now you can parse for whatever information you need
names = [card.find("h4") for card in cards]
qualifications = [card.find("p", {"class": "qualification"}) for card in cards]
addresses = [card.find("address") for card in cards]

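Although this approach may look more complicated, it is more powerful and can easily be repurposed for the many other cases where MechanicalSoup falls short.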
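The final parsing step above yields BeautifulSoup tag objects rather than plain text, and `find` returns `None` when a card lacks a given element. The sketch below shows one way to turn the cards into plain dictionaries. The sample markup is made up for illustration (only the tag and class names come from the answer), and the helper names `text_or_none` and `parse_result_cards` are hypothetical.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; only the tag and class names
# (address_sec_contents, qualification, h4, address) come from the answer.
SAMPLE_HTML = """
<div id="search_result">
  <div class="address_sec_contents">
    <h4>Dr Example One</h4>
    <p class="qualification">Sample qualification</p>
    <address>1 Sample St, Melbourne VIC 3000</address>
  </div>
  <div class="address_sec_contents">
    <h4>Dr Example Two</h4>
    <address>2 Sample Rd, Melbourne VIC 3000</address>
  </div>
</div>
"""


def text_or_none(tag):
    """Return the tag's text with whitespace trimmed, or None if the tag is missing."""
    return tag.get_text(" ", strip=True) if tag is not None else None


def parse_result_cards(html):
    """Extract name, qualification and address from each result card."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for card in soup.find_all("div", class_="address_sec_contents"):
        records.append({
            "name": text_or_none(card.find("h4")),
            "qualification": text_or_none(card.find("p", class_="qualification")),
            "address": text_or_none(card.find("address")),
        })
    return records


records = parse_result_cards(SAMPLE_HTML)
for record in records:
    print(record)
```

In the Selenium flow, the `innerHTML` string of the `search_result` element would be passed to `parse_result_cards` in place of `SAMPLE_HTML`.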

python

beautifulsoup

mechanicalsoup