NoSuchElementException: Message: no such element: Unable to locate element error while scraping the top 20 holders using Selenium and Python

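I am trying to scrape the top 20 holders of an ERC-20 token on-chain, and I am using Selenium for that. It looks as if the element behind the XPath never loads, or does not get enough time to load.

I am trying to load this page: https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances

I have tried both an implicit wait and an explicit wait. When I run the webdriver I can even see the page load, but it never finds the path...

Code for the explicit wait: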

options = Options()
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--no-sandbox")
driver = webdriver.Chrome(chrome_options=options)
driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
wait = WebDriverWait(driver, 10, poll_frequency=1)
wait.until(EC.visibility_of_element_located((By.XPATH, '//*[@id="maintable"]/div[3]/table/tbody/')))

Error:

selenium.common.exceptions.TimeoutException: Message:

Yes, there is not even a message...

Code for the implicit wait:

options = Options()
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--no-sandbox")
driver = webdriver.Chrome(chrome_options=options)
driver.implicitly_wait(10)
driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
for i in range(1, 21):
    req = driver.find_element_by_xpath('//*[@id="maintable"]/div[3]/table/tbody/tr[' + str(i) + ']/td[2]/span/a')

Error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="maintable"]/div[3]/table/tbody/tr[1]/td[2]/span/a"}

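So, as I said, it looks as if the driver does not get enough time to load the page, but even with 20, 30, ... seconds it still cannot find the path.

Also, when I copy the XPath from the browser window that the script opens, I can find the element.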

The table is inside an iframe, so you need to switch to the iframe first before you can access the table.

Induce WebDriverWait() and wait for frame_to_be_available_and_switch_to_it().

Induce WebDriverWait() and wait for visibility_of_all_elements_located().

Code:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

driver=webdriver.Chrome()
driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.ID,"tokeholdersiframe")))
elements=WebDriverWait(driver,20).until(EC.visibility_of_all_elements_located((By.XPATH,'//*[@id="maintable"]/div[3]/table/tbody//tr/td[2]//a')))
for ele in elements:
    print(ele.get_attribute('href'))

If you only want the first 20 holders, use this:

for ele in elements[:20]:
    print(ele.get_attribute('href'))

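To scrape the top 20 holders of an ERC-20 token on-chain, since the holder information is inside an <iframe>, you have to:

  • scrollIntoView the token holders chart.

  • Induce WebDriverWait for the desired frame to be available and switch to it.

  • Induce WebDriverWait for the desired visibility_of_all_elements_located().

  • You can use the following XPath-based locator strategy: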

    driver.get("https://etherscan.io/token/0xdac17f958d2ee523a2206206994597c13d831ec7#balances")
    driver.execute_script("arguments[0].scrollIntoView(true);", WebDriverWait(driver, 60).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='card']"))))
    WebDriverWait(driver,10).until(EC.frame_to_be_available_and_switch_to_it((By.XPATH,"//iframe[@id='tokeholdersiframe']")))
    print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//table[@class='table table-md-text-normal table-hover']//tbody//tr//td[./span]/span/a")))[:20]])
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Console output:

    ['0x7b8c69a0f660cd43ef67948976daae77bc6a019b', 'Binance 7', '0x5754284f345afc66a98fbb0a0afe71e0f007b949', 'Binance', 'Huobi 9', 'Bittrex 3', '0xd545f6eaf71b8e54af1f02dafba6c0d46c491cc1', '0x778476d4c51f93078d61e51c978f90b4a6e500af', 'Bitfinex 2', '0x5041ed759dd4afc3a72b8192c143f72f4724081a', '0xd30b438df65f4f788563b2b3611bd6059bff4ad9', '0x570aeda18a21d8fff6d28a5ef34164553cf9cb77', '0x2b9dc5aaf7b1c15f1fd8aba255919c2a7a184453', '0x6a5b1111a0b5ea8c7ec5665ba09cbacd7fde2b96', 'Gate.io 1', '0x9ec7d40d627ec59981446a6e5acb33d51afcaf8a', '0x231568baa78111377f097bb087241f8379fa18f4', '0xd33547964bae70e1ddd2863a4770dc5cffd86269', 'Huobi 3', 'Compound Tether']
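The console output above mixes raw addresses with name tags such as 'Binance 7', because the link text is read through innerHTML. If the two kinds need to be handled separately, a minimal sketch along these lines could split them. The helper name split_holders and the pattern ADDRESS_RE are hypothetical and not part of either answer; the sketch assumes that a raw address is "0x" followed by exactly 40 hexadecimal digits and that anything else is a name tag.

```python
import re

# Assumption: a raw address is "0x" followed by exactly 40 hex digits.
ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{40}")


def split_holders(entries):
    """Split scraped link texts into (addresses, labels), preserving order.

    Hypothetical helper for illustration; it is not part of Selenium.
    """
    addresses, labels = [], []
    for entry in entries:
        text = entry.strip()
        if ADDRESS_RE.fullmatch(text):
            addresses.append(text)
        else:
            labels.append(text)
    return addresses, labels


# The first five entries of the console output shown above.
sample = [
    "0x7b8c69a0f660cd43ef67948976daae77bc6a019b",
    "Binance 7",
    "0x5754284f345afc66a98fbb0a0afe71e0f007b949",
    "Binance",
    "Huobi 9",
]

addresses, labels = split_holders(sample)
print(addresses)
print(labels)
```

In practice the list built by the WebDriverWait expression would be passed in place of sample. Note that a name tag hides the underlying address, so the address for those rows would have to be read from another attribute of the link.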