Python Selenium web scraping: find_elements_by_xpath returns an empty list

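I took some coding courses at university and am trying to learn Selenium so I can analyze tennis statistics. All of this is new to me.

The page I'm working with is here (https://www.atptour.com/en/scores/results-archive?year=2021), and I'm following a guide from this website (https://www.scrapingbee.com/blog/selenium-python/ , https://www.scrapingbee.com/blog/practical-xpath-for-web-scraping/). The specific problem I'm running into is in the second guide, under the subheading "E-commerce product data extraction".

My goal is to loop through the tournaments and extract the link behind each 'Results' button, but I'm stuck because my program only gives me an empty list.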

from selenium import webdriver
from selenium.webdriver.chrome.options import Options


DRIVER_PATH = r"C:\Program Files (x86)\chromedriver.exe"  # raw string so the backslashes are not treated as escapes
#driver = webdriver.Chrome(executable_path=DRIVER_PATH)
options = Options()
options.headless = True
options.add_argument("--window-size=1920,1200")
driver = webdriver.Chrome(options=options, executable_path=DRIVER_PATH)
#driver.get("https://www.nintendo.com/")
#print(driver.page_source)
#driver.quit()
# 1 Data Collection
# 1.1 Find Links to All Tournaments
tournaments_2021_url = "https://www.atptour.com/en/scores/results-archive?year=2021"
#tournament_class = "tourney-result"
driver.get(tournaments_2021_url) # print(driver.page_source)
tournaments_2021_url_list = driver.find_elements_by_xpath("//a[@class='button-border']")
print("\n tournament urls \n")
print(tournaments_2021_url_list)
print(len(tournaments_2021_url_list))
driver.quit()
# 1.2 For Each Tournament, Find Links to Each Match
# 1.3 For Each Match, Extract Relevant Statistics

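I expected a list of elements or some other objects that I could extract the links from, but what I get is an empty list with a len of 0. Thanks for your help.

To print the href attribute values of all the Results links, you need to induce WebDriverWait for visibility_of_all_elements_located(), and you can use any of the following locator strategies:

  • Using PARTIAL_LINK_TEXT: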

    driver.get("https://www.atptour.com/en/scores/results-archive?year=2021")
    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.PARTIAL_LINK_TEXT, "Results")))])
    driver.quit()
    
  • Using CSS_SELECTOR:

    driver.get("https://www.atptour.com/en/scores/results-archive?year=2021")
    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "a[href$='results']")))])
    driver.quit()
    
  • Using XPATH:

    driver.get("https://www.atptour.com/en/scores/results-archive?year=2021")
    print([my_elem.get_attribute("href") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//a[normalize-space()='Results']")))])
    driver.quit()
    
  • Console output:

    ['https://www.atptour.com/en/scores/archive/delray-beach/499/2021/results', 'https://www.atptour.com/en/scores/archive/antalya/9426/2021/results', 'https://www.atptour.com/en/scores/archive/auckland/301/2021/results', 'https://www.atptour.com/en/scores/archive/melbourne/8998/2021/results', 'https://www.atptour.com/en/scores/archive/melbourne/9428/2021/results', 'https://www.atptour.com/en/scores/archive/pune/891/2021/results', 'https://www.atptour.com/en/scores/archive/atp-cup/8888/2021/results', 'https://www.atptour.com/en/scores/archive/australian-open/580/2021/results', 'https://www.atptour.com/en/scores/archive/new-york/424/2021/results', 'https://www.atptour.com/en/scores/archive/rio-de-janeiro/6932/2021/results', 'https://www.atptour.com/en/scores/archive/singapore/9460/2021/results', 'https://www.atptour.com/en/scores/archive/cordoba/9158/2021/results', 'https://www.atptour.com/en/scores/archive/montpellier/375/2021/results', 'https://www.atptour.com/en/scores/archive/rotterdam/407/2021/results', 'https://www.atptour.com/en/scores/archive/buenos-aires/506/2021/results', 'https://www.atptour.com/en/scores/archive/doha/451/2021/results', 'https://www.atptour.com/en/scores/archive/marseille/496/2021/results', 'https://www.atptour.com/en/scores/archive/santiago/8996/2021/results', 'https://www.atptour.com/en/scores/archive/dubai/495/2021/results', 'https://www.atptour.com/en/scores/archive/acapulco/807/2021/results', 'https://www.atptour.com/en/scores/archive/miami/403/2021/results', 'https://www.atptour.com/en/scores/archive/marrakech/360/2021/results', 'https://www.atptour.com/en/scores/archive/cagliari/9481/2021/results', 'https://www.atptour.com/en/scores/archive/marbella/9462/2021/results', 'https://www.atptour.com/en/scores/archive/houston/717/2021/results', 'https://www.atptour.com/en/scores/archive/monte-carlo/410/2021/results', 'https://www.atptour.com/en/scores/archive/barcelona/425/2021/results', 
'https://www.atptour.com/en/scores/archive/belgrade/5053/2021/results', 'https://www.atptour.com/en/scores/archive/estoril/7290/2021/results', 'https://www.atptour.com/en/scores/archive/munich/308/2021/results', 'https://www.atptour.com/en/scores/archive/madrid/1536/2021/results', 'https://www.atptour.com/en/scores/archive/rome/416/2021/results', 'https://www.atptour.com/en/scores/archive/geneva/322/2021/results', 'https://www.atptour.com/en/scores/archive/lyon/7694/2021/results', 'https://www.atptour.com/en/scores/archive/parma/9510/2021/results', 'https://www.atptour.com/en/scores/archive/belgrade/9512/2021/results', 'https://www.atptour.com/en/scores/archive/roland-garros/520/2021/results', 'https://www.atptour.com/en/scores/archive/s-hertogenbosch/440/2021/results', 'https://www.atptour.com/en/scores/archive/stuttgart/321/2021/results', 'https://www.atptour.com/en/scores/archive/halle/500/2021/results', 'https://www.atptour.com/en/scores/archive/london/311/2021/results', 'https://www.atptour.com/en/scores/archive/mallorca/8994/2021/results', 'https://www.atptour.com/en/scores/archive/eastbourne/741/2021/results', 'https://www.atptour.com/en/scores/archive/wimbledon/540/2021/results', 'https://www.atptour.com/en/scores/archive/hamburg/414/2021/results', 'https://www.atptour.com/en/scores/archive/newport/315/2021/results', 'https://www.atptour.com/en/scores/archive/bastad/316/2021/results', 'https://www.atptour.com/en/scores/archive/los-cabos/7480/2021/results', 'https://www.atptour.com/en/scores/archive/gstaad/314/2021/results', 'https://www.atptour.com/en/scores/archive/umag/439/2021/results', 'https://www.atptour.com/en/scores/archive/tokyo/96/2021/results', 'https://www.atptour.com/en/scores/archive/atlanta/6116/2021/results', 'https://www.atptour.com/en/scores/archive/kitzbuhel/319/2021/results', 'https://www.atptour.com/en/scores/archive/washington/418/2021/results', 'https://www.atptour.com/en/scores/archive/toronto/421/2021/results', 
'https://www.atptour.com/en/scores/archive/cincinnati/422/2021/results', 'https://www.atptour.com/en/scores/archive/winston-salem/6242/2021/results', 'https://www.atptour.com/en/scores/archive/us-open/560/2021/results', 'https://www.atptour.com/en/scores/archive/nur-sultan/9410/2021/results', 'https://www.atptour.com/en/scores/archive/metz/341/2021/results', 'https://www.atptour.com/en/scores/archive/laver-cup/9210/2021/results', 'https://www.atptour.com/en/scores/archive/san-diego/9569/2021/results', 'https://www.atptour.com/en/scores/archive/sofia/7434/2021/results', 'https://www.atptour.com/en/scores/archive/chengdu/7581/2021/results', 'https://www.atptour.com/en/scores/archive/zhuhai/9164/2021/results', 'https://www.atptour.com/en/scores/archive/shanghai/5014/2021/results', 'https://www.atptour.com/en/scores/archive/beijing/747/2021/results', 'https://www.atptour.com/en/scores/archive/tokyo/329/2021/results', 'https://www.atptour.com/en/scores/archive/indian-wells/404/2021/results', 'https://www.atptour.com/en/scores/archive/moscow/438/2021/results', 'https://www.atptour.com/en/scores/archive/antwerp/7485/2021/results', 'https://www.atptour.com/en/scores/archive/vienna/337/2021/results', 'https://www.atptour.com/en/scores/archive/st-petersburg/568/2021/results', 'https://www.atptour.com/en/scores/archive/basel/328/2021/results', 'https://www.atptour.com/en/scores/archive/paris/352/2021/results', 'https://www.atptour.com/en/scores/archive/stockholm/429/2021/results', 'https://www.atptour.com/en/scores/archive/intesa-sanpaolo-next-gen-atp-finals/7696/2021/results', 'https://www.atptour.com/en/scores/archive/nitto-atp-finals/605/2021/results']
    
  • Note: you have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
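
Putting the pieces above together, here is a minimal sketch of how the wait and the href extraction could be wrapped into reusable functions. The names collect_hrefs and fetch_result_links are made up for illustration, and the XPath is the one from the XPATH option above; either of the other two locators should work the same way. The Selenium imports sit inside the function only so that the small helper can be tried without a browser.

```python
def collect_hrefs(elements):
    """Return the non-empty href values of WebElement-like objects.

    Order is preserved and duplicates are dropped.
    """
    seen = set()
    hrefs = []
    for element in elements:
        href = element.get_attribute("href")
        if href and href not in seen:
            seen.add(href)
            hrefs.append(href)
    return hrefs


def fetch_result_links(driver, url, timeout=20):
    """Load url, wait until the 'Results' links are visible, return their hrefs.

    Hypothetical helper: `driver` is assumed to be an already created
    Selenium WebDriver. WebDriverWait raises TimeoutException if the
    condition is not met within `timeout` seconds.
    """
    # Imported lazily so collect_hrefs can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver.get(url)
    elements = WebDriverWait(driver, timeout).until(
        EC.visibility_of_all_elements_located(
            (By.XPATH, "//a[normalize-space()='Results']")
        )
    )
    return collect_hrefs(elements)
```

With the driver from the question, a call such as fetch_result_links(driver, tournaments_2021_url) would then stand in for the find_elements_by_xpath line.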
    

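I took your code and ran it, and it worked fine. It did what it was supposed to do. So my suggestion is to run it through a debugger and step through it to make sure everything happens as expected. Also remove the headless option so you can confirm visually what the browser is doing. Check your Chrome browser version and make sure it matches the chromedriver you are using. (Although if the versions don't match, it should give you an error message.) Finally, if all else fails, try a different browser, such as Firefox with the appropriate geckodriver.

Here is the updated code to add to your base code: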
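
For the browser/driver version check, something like the following sketch might help. It only compares the major version numbers of two version strings; the function names are made up for illustration, and the capability keys in the usage comment are an assumption based on what recent ChromeDriver sessions report, so check driver.capabilities on your own setup.

```python
def major_version(version_string):
    """Return the leading major version as an int, e.g. '96.0.4664.45 (abc)' -> 96."""
    return int(version_string.strip().split(".", 1)[0])


def versions_match(browser_version, driver_version):
    """True when the browser and the driver report the same major version."""
    return major_version(browser_version) == major_version(driver_version)


# Hypothetical usage with a live session (the capability keys are an assumption):
#   caps = driver.capabilities
#   versions_match(caps["browserVersion"], caps["chrome"]["chromedriverVersion"])
```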

from selenium.webdriver.common.by import By
tournaments_2021_url = "https://www.atptour.com/en/scores/results-archive?year=2021"
driver.get(tournaments_2021_url)  # uses the `driver` created in the question's code
tournaments_2021_url_list = driver.find_elements(By.XPATH, "//a[@class='button-border']")
print("\nTournament URLs:\n")
for row in tournaments_2021_url_list:
    print(row.get_attribute("href"))
print("\nNumber of rows:")
print(len(tournaments_2021_url_list))

Here is the output after running everything:

Tournament URLs:

https://www.atptour.com/en/scores/archive/delray-beach/499/2021/results
https://www.atptour.com/en/scores/archive/antalya/9426/2021/results
https://www.atptour.com/en/scores/archive/auckland/301/2021/results
https://www.atptour.com/en/scores/archive/melbourne/8998/2021/results
https://www.atptour.com/en/scores/archive/melbourne/9428/2021/results
https://www.atptour.com/en/scores/archive/pune/891/2021/results
https://www.atptour.com/en/scores/archive/atp-cup/8888/2021/results
https://www.atptour.com/en/scores/archive/australian-open/580/2021/results
https://www.atptour.com/en/scores/archive/new-york/424/2021/results
https://www.atptour.com/en/scores/archive/rio-de-janeiro/6932/2021/results
https://www.atptour.com/en/scores/archive/singapore/9460/2021/results
https://www.atptour.com/en/scores/archive/cordoba/9158/2021/results
https://www.atptour.com/en/scores/archive/montpellier/375/2021/results
https://www.atptour.com/en/scores/archive/rotterdam/407/2021/results
https://www.atptour.com/en/scores/archive/buenos-aires/506/2021/results
https://www.atptour.com/en/scores/archive/doha/451/2021/results
https://www.atptour.com/en/scores/archive/marseille/496/2021/results
https://www.atptour.com/en/scores/archive/santiago/8996/2021/results
https://www.atptour.com/en/scores/archive/dubai/495/2021/results
https://www.atptour.com/en/scores/archive/acapulco/807/2021/results
https://www.atptour.com/en/scores/archive/miami/403/2021/results
https://www.atptour.com/en/scores/archive/marrakech/360/2021/results
https://www.atptour.com/en/scores/archive/cagliari/9481/2021/results
https://www.atptour.com/en/scores/archive/marbella/9462/2021/results
https://www.atptour.com/en/scores/archive/houston/717/2021/results
https://www.atptour.com/en/scores/archive/monte-carlo/410/2021/results
https://www.atptour.com/en/scores/archive/barcelona/425/2021/results
https://www.atptour.com/en/scores/archive/belgrade/5053/2021/results
https://www.atptour.com/en/scores/archive/estoril/7290/2021/results
https://www.atptour.com/en/scores/archive/munich/308/2021/results
https://www.atptour.com/en/scores/archive/madrid/1536/2021/results
https://www.atptour.com/en/scores/archive/rome/416/2021/results
https://www.atptour.com/en/scores/archive/geneva/322/2021/results
https://www.atptour.com/en/scores/archive/lyon/7694/2021/results
https://www.atptour.com/en/scores/archive/parma/9510/2021/results
https://www.atptour.com/en/scores/archive/belgrade/9512/2021/results
https://www.atptour.com/en/scores/archive/roland-garros/520/2021/results
https://www.atptour.com/en/scores/archive/s-hertogenbosch/440/2021/results
https://www.atptour.com/en/scores/archive/stuttgart/321/2021/results
https://www.atptour.com/en/scores/archive/halle/500/2021/results
https://www.atptour.com/en/scores/archive/london/311/2021/results
https://www.atptour.com/en/scores/archive/mallorca/8994/2021/results
https://www.atptour.com/en/scores/archive/eastbourne/741/2021/results
https://www.atptour.com/en/scores/archive/wimbledon/540/2021/results
https://www.atptour.com/en/scores/archive/hamburg/414/2021/results
https://www.atptour.com/en/scores/archive/newport/315/2021/results
https://www.atptour.com/en/scores/archive/bastad/316/2021/results
https://www.atptour.com/en/scores/archive/los-cabos/7480/2021/results
https://www.atptour.com/en/scores/archive/gstaad/314/2021/results
https://www.atptour.com/en/scores/archive/umag/439/2021/results
https://www.atptour.com/en/scores/archive/tokyo/96/2021/results
https://www.atptour.com/en/scores/archive/atlanta/6116/2021/results
https://www.atptour.com/en/scores/archive/kitzbuhel/319/2021/results
https://www.atptour.com/en/scores/archive/washington/418/2021/results
https://www.atptour.com/en/scores/archive/toronto/421/2021/results
https://www.atptour.com/en/scores/archive/cincinnati/422/2021/results
https://www.atptour.com/en/scores/archive/winston-salem/6242/2021/results
https://www.atptour.com/en/scores/archive/us-open/560/2021/results
https://www.atptour.com/en/scores/archive/nur-sultan/9410/2021/results
https://www.atptour.com/en/scores/archive/metz/341/2021/results
https://www.atptour.com/en/scores/archive/laver-cup/9210/2021/results
https://www.atptour.com/en/scores/archive/san-diego/9569/2021/results
https://www.atptour.com/en/scores/archive/sofia/7434/2021/results
https://www.atptour.com/en/scores/archive/chengdu/7581/2021/results
https://www.atptour.com/en/scores/archive/zhuhai/9164/2021/results
https://www.atptour.com/en/scores/archive/shanghai/5014/2021/results
https://www.atptour.com/en/scores/archive/beijing/747/2021/results
https://www.atptour.com/en/scores/archive/tokyo/329/2021/results
https://www.atptour.com/en/scores/archive/indian-wells/404/2021/results
https://www.atptour.com/en/scores/archive/moscow/438/2021/results
https://www.atptour.com/en/scores/archive/antwerp/7485/2021/results
https://www.atptour.com/en/scores/archive/vienna/337/2021/results
https://www.atptour.com/en/scores/archive/st-petersburg/568/2021/results
https://www.atptour.com/en/scores/archive/basel/328/2021/results
https://www.atptour.com/en/scores/archive/paris/352/2021/results
https://www.atptour.com/en/scores/archive/stockholm/429/2021/results
https://www.atptour.com/en/scores/archive/intesa-sanpaolo-next-gen-atp-finals/7696/2021/results
https://www.atptour.com/en/scores/archive/nitto-atp-finals/605/2021/results

Number of rows:
78

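Using find_elements(By.XPATH, ...) also has the added benefit of avoiding a deprecation warning. driver.find_elements_by_xpath is deprecated, and a warning message shows up after running pytest. The newer driver.find_elements(By.XPATH, XPATH) avoids this, although it does add one extra import line to the code: from selenium.webdriver.common.by import By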
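
For step 1.2 in the question (going from each tournament to its matches), it may help to split each results link into its parts first. This is only a sketch that assumes every link follows the pattern visible in the output above (/en/scores/archive/<slug>/<id>/<year>/results); parse_results_url is a made-up name, not part of Selenium or the ATP site.

```python
from urllib.parse import urlparse


def parse_results_url(url):
    """Split a tournament results URL into slug, tournament id and year.

    Assumes the path looks like /en/scores/archive/<slug>/<id>/<year>/results
    and raises ValueError for anything else.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if (
        len(parts) != 7
        or parts[:3] != ["en", "scores", "archive"]
        or parts[-1] != "results"
    ):
        raise ValueError(f"Unexpected results URL: {url}")
    slug, tournament_id, year = parts[3:6]
    return {"slug": slug, "tournament_id": int(tournament_id), "year": int(year)}


example = parse_results_url(
    "https://www.atptour.com/en/scores/archive/delray-beach/499/2021/results"
)
print(example)
```

Keeping the id and year alongside the slug makes it easy to tell apart tournaments that share a city name, such as the two Melbourne and two Belgrade entries in the list.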