How to locate an element within bad HTML with Python Selenium

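I want to scrape the Athletic Director's information from this page. The problem is that a strong tag labels the name and email of every person on the page. I only want an XPath that extracts exactly the Athletic Director's name and email. Here is the link to the site, followed by the relevant HTML, for better understanding: "https://fhsaa.com/sports/2020/1/28/member_directory.aspx"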

<div id="school_detail"><div class="row" align="center"><div class="bottom-spacing col-md-4"><button class="btn btn-primary btn-md" onclick="showAthleticFaculty(10)">Athletic Faculty</button></div><div class="bottom-spacing col-md-4"><button class="btn btn-primary btn-md" onclick="showCoachesAndSports(10)">Coaches &amp; Sports</button></div><div class="bottom-spacing col-md-4"><button class="btn btn-primary btn-md" onclick="showSchoolDetail(10)">School Information</button></div></div><br><h5 align="center"><u><strong>American (Hialeah) - Athletic Faculty</strong></u></h5><div><h6 class="athletic-faculty-header">Volunteer</h6></div><strong>N/A</strong><br><br><div><h6 class="athletic-faculty-header">Principal/Head Master</h6></div><strong>Name:</strong>  Stephen <br><strong>Email:</strong> <a href="mailto:StephenPapp@dadeschools.net">StephenPapp@dadeschools.net</a><br><br><div><h6 class="athletic-faculty-header">Athletic Director</h6></div><strong>Name:</strong>  Marcus Gabriel<br><strong>Email:</strong> <a href="mailto:mgabriel@dadeschools.net">mgabriel@dadeschools.net</a><br><br><div><h6 class="athletic-faculty-header">Assistant/Co AD</h6></div><strong>Name:</strong>  Ginette Torres<br><strong>Email:</strong> <a href="mailto:gtcastro@dadeschools.net">gtcastro@dadeschools.net</a><br><br><div><h6 class="athletic-faculty-header">Assistant/Vice Principal</h6></div><strong>Name:</strong>  Alex Gonzalez<br><strong>Email:</strong> <a href="mailto:agonza12@dadeschools.net">agonza12@dadeschools.net</a><br><br><div><h6 class="athletic-faculty-header">Administrative Assistant/Athletics</h6></div><strong>Name:</strong>  Shanell <br><strong>Email:</strong> <a href="mailto:slyoung005@gmail.com">slyoung005@gmail.com</a><br><br><div><h6 class="athletic-faculty-header">Financial/Bookkeeper Contact</h6></div><strong>Name:</strong>  Christopher Keighley<br><strong>Email:</strong> <a href="mailto:217441@dadeschools.net">217441@dadeschools.net</a><br><br><div><h6 
class="athletic-faculty-header">Athletic Trainer</h6></div><strong>Name:</strong>  Gorin Aaron<br><strong>Email:</strong> <a href="mailto:333956@dadeschools.net">333956@dadeschools.net</a><br><br><div><h6 class="athletic-faculty-header">Medical - First Responder</h6></div><strong>N/A</strong><br><br></div>

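To get the email address, use this XPath. It anchors on the Athletic Director header, steps up to the parent div, and then follows the siblings to the Email: label and the link after it: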

//h6[text()='Athletic Director']/../following-sibling::strong[text()='Email:']/following-sibling::a

Update:

from selenium.webdriver.common.by import By
print(driver.find_element(By.XPATH, "//h6[text()='Athletic Director']/../following-sibling::strong[text()='Email:']/following-sibling::a").text)
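The same pattern should work for the other headers in the markup, since every role uses the same h6 / strong layout. A minimal sketch of building the expression per role follows; the helper names faculty_xpath and email_xpath are hypothetical, and the sketch assumes the role text contains no single quotes.

```python
def faculty_xpath(role: str, label: str = "Email:") -> str:
    """Build the XPath for the <strong> label that follows a role header."""
    # Assumption: `role` and `label` contain no single quotes, because they
    # are embedded in single-quoted XPath string literals.
    return (
        f"//h6[text()='{role}']/.."
        f"/following-sibling::strong[text()='{label}']"
    )


def email_xpath(role: str) -> str:
    """XPath for the mailto link that follows the role's Email: label."""
    return faculty_xpath(role, "Email:") + "/following-sibling::a"


# Intended use with an existing driver (not run here):
#   driver.find_element(By.XPATH, email_xpath("Athletic Trainer")).text
print(email_xpath("Athletic Director"))
```

Because find_element returns the first match in document order, the expression picks the Email: link closest after the chosen header, even though later roles also match the sibling axis.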

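Update 1: the name is a bare text node right after the Name: label, and find_element can only return elements, so locate the label and read its next sibling with JavaScript: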

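# Locate the "Name:" label that follows the Athletic Director header.
elem = driver.find_element(By.XPATH, "//h6[text()='Athletic Director']/../following-sibling::strong[text()='Name:']")
# The name itself is the text node immediately after the label.
name = driver.execute_script("return arguments[0].nextSibling.textContent;", elem)
# Strip the leading spaces that the markup puts before the name.
print(name.strip())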
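An alternative to XPath, in case the sibling axes prove brittle: pull the panel's HTML out of Selenium once and walk it with the standard-library HTMLParser. This is only a sketch under assumptions. FacultyParser and SAMPLE_HTML are hypothetical names, SAMPLE_HTML is a trimmed copy of the markup from the question, and in a live session the string would presumably come from driver.find_element(By.ID, "school_detail").get_attribute("innerHTML") after the Athletic Faculty panel has loaded.

```python
from html.parser import HTMLParser


class FacultyParser(HTMLParser):
    """Collect {role: {"Name": ..., "Email": ...}} from the faculty markup."""

    def __init__(self):
        super().__init__()
        self.faculty = {}
        self._role = None   # text of the most recent <h6> header
        self._label = None  # most recent <strong> label without the colon
        self._tag = None    # "h6" or "strong" while inside one of them

    def handle_starttag(self, tag, attrs):
        if tag in ("h6", "strong"):
            self._tag = tag

    def handle_endtag(self, tag):
        if tag in ("h6", "strong"):
            self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "h6":
            # A new role header starts a new record.
            self._role = text
            self.faculty[text] = {}
            self._label = None
        elif self._tag == "strong":
            self._label = text.rstrip(":")
        elif self._role and self._label in ("Name", "Email"):
            # Text after a label (bare text for Name, link text for Email).
            self.faculty[self._role][self._label] = text
            self._label = None


# Trimmed copy of the markup from the question.
SAMPLE_HTML = (
    '<h5 align="center"><u><strong>American (Hialeah) - Athletic Faculty</strong></u></h5>'
    '<div><h6 class="athletic-faculty-header">Volunteer</h6></div><strong>N/A</strong><br><br>'
    '<div><h6 class="athletic-faculty-header">Athletic Director</h6></div>'
    '<strong>Name:</strong>  Marcus Gabriel<br>'
    '<strong>Email:</strong> <a href="mailto:mgabriel@dadeschools.net">mgabriel@dadeschools.net</a><br><br>'
    '<div><h6 class="athletic-faculty-header">Assistant/Co AD</h6></div>'
    '<strong>Name:</strong>  Ginette Torres<br>'
    '<strong>Email:</strong> <a href="mailto:gtcastro@dadeschools.net">gtcastro@dadeschools.net</a><br><br>'
)

parser = FacultyParser()
parser.feed(SAMPLE_HTML)
parser.close()

director = parser.faculty["Athletic Director"]
print(director["Name"], director["Email"])
```

Roles listed as N/A, such as Volunteer in the sample, end up with an empty record, so a lookup like parser.faculty["Volunteer"].get("Email") returns None rather than raising.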