How to make Selenium not wait for the full page load when the page has a slow script?
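Selenium's driver.get(url) waits for the full page to load. However, the page I am scraping tries to load a dead JS script, so my Python script waits for it and hangs for several minutes. This problem can occur on every page of the site.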
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://www.cortinadecor.com/productos/17/estores-enrollables-screen/estores-screen-corti-3000')
# The page tries to load: https://www.cetelem.es/eCommerceCalculadora/resources/js/eCalculadoraCetelemCombo.js
driver.find_element_by_name('ANCHO').send_keys("100")
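How can I limit the wait time, block the AJAX file from loading, or solve this some other way?
I am currently testing my script with webdriver.Chrome(), but I will use PhantomJS() or possibly Firefox(). So if a method relies on changing browser settings, it must work across browsers.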
When Selenium loads a page/URL, it follows the pageLoadStrategy setting, which defaults to normal. To make Selenium not wait for the full page load, we can configure pageLoadStrategy, which supports three values:
normal (full page load)
eager (interactive)
none
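As a rough illustration of what each value means, the sketch below checks document.readyState by hand. The mapping (normal waits for "complete", eager is satisfied from "interactive" onward, none does not wait) reflects my reading of the WebDriver specification. READY_STATES and page_is_ready are hypothetical names, not part of Selenium, and driver only needs an execute_script method.

```python
# Hypothetical helper: which document.readyState values satisfy each strategy.
READY_STATES = {
    "normal": {"complete"},                # page and its subresources have loaded
    "eager": {"interactive", "complete"},  # DOM is parsed, subresources may still load
    "none": None,                          # do not wait at all
}


def page_is_ready(driver, strategy="normal"):
    """Return True if the current page satisfies the given strategy."""
    accepted = READY_STATES[strategy]
    if accepted is None:
        return True
    state = driver.execute_script("return document.readyState")
    return state in accepted
```

With a real driver this could be called as page_is_ready(driver, "eager") after driver.get(...) to see how far loading has progressed.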
Here is the code block to configure pageLoadStrategy:
Firefox:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
caps = DesiredCapabilities().FIREFOX
caps["pageLoadStrategy"] = "normal" # complete
#caps["pageLoadStrategy"] = "eager" # interactive
#caps["pageLoadStrategy"] = "none"
driver = webdriver.Firefox(desired_capabilities=caps, executable_path=r'C:\path\to\geckodriver.exe')
driver.get("http://google.com")
Chrome:
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
caps = DesiredCapabilities().CHROME
caps["pageLoadStrategy"] = "normal" # complete
#caps["pageLoadStrategy"] = "eager" # interactive
#caps["pageLoadStrategy"] = "none"
driver = webdriver.Chrome(desired_capabilities=caps, executable_path=r'C:\path\to\chromedriver.exe')
driver.get("http://google.com")
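The two snippets above use the Selenium 3 style arguments desired_capabilities and executable_path. As far as I know, recent Selenium 4 releases no longer accept them and expect the strategy to be set on an options object instead. The sketch below assumes Selenium 4 and that the browser drivers can be found automatically; make_driver and VALID_STRATEGIES are hypothetical names. It also shows one way to keep the setting identical across Chrome and Firefox, which the question asks for.

```python
# Sketch only: assumes Selenium 4.x. `make_driver` is a hypothetical helper.
VALID_STRATEGIES = ("normal", "eager", "none")
SUPPORTED_BROWSERS = ("chrome", "firefox")


def make_driver(browser="chrome", strategy="eager"):
    """Start Chrome or Firefox with the given pageLoadStrategy."""
    if strategy not in VALID_STRATEGIES:
        raise ValueError(f"unknown pageLoadStrategy: {strategy!r}")
    if browser not in SUPPORTED_BROWSERS:
        raise ValueError(f"unsupported browser: {browser!r}")

    # Imported here so the argument checks above run even on a machine
    # where Selenium is not installed.
    from selenium import webdriver

    if browser == "chrome":
        options = webdriver.ChromeOptions()
        options.page_load_strategy = strategy
        return webdriver.Chrome(options=options)

    options = webdriver.FirefoxOptions()
    options.page_load_strategy = strategy
    return webdriver.Firefox(options=options)
```

A call such as driver = make_driver("firefox", "none") would then replace the browser-specific blocks above.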
Note: the pageLoadStrategy values normal, eager and none are a requirement as per the WebDriver W3C Editor's Draft, but the eager value is still a WIP (Work In Progress) within the ChromeDriver implementation. You can find a detailed discussion in
@undetected Selenium's answer works well, but part of it did not work for me on Chrome. For Chrome, use the code below:
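from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
options = webdriver.ChromeOptions()  # add any Chrome options you need here
capa = DesiredCapabilities.CHROME.copy()  # copy so the shared default dict is not modified
capa["pageLoadStrategy"] = "none"  # driver.get() returns without waiting for the page load
# 'PATH' is a placeholder for the location of chromedriver
browser = webdriver.Chrome(desired_capabilities=capa, executable_path='PATH', options=options)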
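With pageLoadStrategy set to "none", get() returns right away, so the element from the question (name ANCHO) may not exist yet when the script looks for it. Some explicit wait is then needed. The sketch below is a dependency-free polling loop that shows the idea; wait_for and its parameters are hypothetical names, and it is a stand-in for, not a copy of, Selenium's own waiting utilities.

```python
import time


def wait_for(condition, timeout=15.0, poll=0.5, ignored=(Exception,)):
    """Call condition() until it returns a truthy value or timeout expires.

    Exceptions listed in `ignored` (for example "element not found")
    are treated as "not ready yet".
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Assumed usage with the browser object created above:
#   browser.get(url)
#   field = wait_for(lambda: browser.find_element("name", "ANCHO"))
#   field.send_keys("100")
```

Selenium also ships WebDriverWait together with expected_conditions in selenium.webdriver.support, which covers the same need and is probably the better choice in a real script.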