How can I use Selenium to wait for an element to be visible on a page (but then move on to something else)?
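I am trying to scrape URLs from a web page. They sit in a rankings table that takes a few seconds to load.
What I want to do is wait until the rankings table has finished loading, then get it by its id and iterate over its elements.
Here is the code I use to fetch the page and wait: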
driver = webdriver.Chrome(cred_path)
driver.get(page)
wait(driver, 5).until(EC.presence_of_element_located((By.ID, 'sc-ljMRFG hgfcNB rankings-table')))
#soup = BeautifulSoup(driver.page_source, features='lxml')
#print(soup.prettify())
rankings = soup.find_all('div', {'class': "sc-ljMRFG hgfcNB rankings-table"})[0]
print(rankings)
As far as I can tell, the code actually runs up to that point (when the window opens I can see the table load), but then it throws a timeout error:
Traceback (most recent call last):
File "ethereum_scraper_dappRadarv2.py", line 377, in <module>
general_dapp_page()
File "ethereum_scraper_dappRadarv2.py", line 39, in general_dapp_page
_ = wait(driver, 5).until(EC.visibility_of_element_located((By.ID, 'sc-ljMRFG hgfcNB rankings-table')))
File "/Users/trentfowler/opt/anaconda3/lib/python3.8/site-packages/selenium/webdriver/support/wait.py", line 89, in until
raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
Stacktrace:
0 chromedriver 0x0000000104dd4269 __gxx_personality_v0 + 582729
1 chromedriver 0x0000000104d5fc33 __gxx_personality_v0 + 106003
2 chromedriver 0x000000010491ce28 chromedriver + 171560
3 chromedriver 0x00000001049523d2 chromedriver + 390098
4 chromedriver 0x0000000104952591 chromedriver + 390545
5 chromedriver 0x00000001049846b4 chromedriver + 595636
6 chromedriver 0x000000010496f9fd chromedriver + 510461
7 chromedriver 0x0000000104982462 chromedriver + 586850
8 chromedriver 0x000000010496fc23 chromedriver + 511011
9 chromedriver 0x000000010494575e chromedriver + 337758
10 chromedriver 0x0000000104946a95 chromedriver + 342677
11 chromedriver 0x0000000104d908ab __gxx_personality_v0 + 305803
12 chromedriver 0x0000000104da7863 __gxx_personality_v0 + 399939
13 chromedriver 0x0000000104dacc7f __gxx_personality_v0 + 421471
14 chromedriver 0x0000000104da8bba __gxx_personality_v0 + 404890
15 chromedriver 0x0000000104d84e51 __gxx_personality_v0 + 258097
16 chromedriver 0x0000000104dc4158 __gxx_personality_v0 + 516920
17 chromedriver 0x0000000104dc42e1 __gxx_personality_v0 + 517313
18 chromedriver 0x0000000104ddb6f8 __gxx_personality_v0 + 612568
19 libsystem_pthread.dylib 0x00007fff205d18fc _pthread_start + 224
20 libsystem_pthread.dylib 0x00007fff205cd443 thread_start + 15
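(Note that, as far as I can tell, the subsequent rankings = and print statements are never executed.)
My current interpretation is that Selenium executes the wait command but then times out because it has been given no further instruction (i.e. I am not calling click() on anything).
I have RTFM, but the Selenium documentation is very sparse. Is there really no concept of waiting for an element to load and then moving on to other processing tasks? Do I have to interact with the element in some way, and if so, what is the best way to do that, given that all I really want is to iterate over its inner elements?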
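You are probably using the wrong locator: sc-ljMRFG hgfcNB rankings-table cannot be the value of an ID attribute, but it may well be the value of the class attribute.
So, effectively, you need to change: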
wait(driver, 5).until(EC.presence_of_element_located((By.ID, 'sc-ljMRFG hgfcNB rankings-table')))
Induce WebDriverWait for visibility_of_element_located() instead; you can use either of the following locator strategies:
Using CLASS_NAME:
wait(driver, 5).until(EC.visibility_of_element_located((By.CLASS_NAME, 'rankings-table')))
Using CSS_SELECTOR:
wait(driver, 5).until(EC.visibility_of_element_located((By.CSS_SELECTOR, '.sc-ljMRFG.hgfcNB.rankings-table')))
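To address the second half of the question: no click() or other interaction is needed. until() returns the located element as soon as the condition holds, so you can carry straight on and iterate over its children. Below is a minimal sketch of that flow, not a definitive implementation. The function names class_attr_to_css and collect_table_links are hypothetical, and it assumes the URLs you want are ordinary a elements inside the table, which the question does not confirm.

```python
def class_attr_to_css(class_attr: str) -> str:
    """Turn a space-separated class attribute value into a compound CSS selector."""
    return "".join("." + name for name in class_attr.split())


def collect_table_links(driver, class_attr: str, timeout: float = 10) -> list:
    """Wait until the table is visible, then return the href of every link inside it.

    Assumption: the URLs of interest are <a> elements nested in the table.
    """
    # Imported here so the helper above can be used without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.CSS_SELECTOR, class_attr_to_css(class_attr))
    # until() returns the WebElement once the condition holds; it raises
    # TimeoutException only if the element never becomes visible in time.
    table = WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located(locator)
    )
    # Search only inside the table, not the whole page.
    links = table.find_elements(By.TAG_NAME, "a")
    return [link.get_attribute("href") for link in links]
```

With the driver from the question, usage would look like urls = collect_table_links(driver, "sc-ljMRFG hgfcNB rankings-table"). The sc-ljMRFG and hgfcNB parts look auto-generated and may change when the site is rebuilt, so passing just "rankings-table" may prove more stable.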