Selenium Python can't scrape a site

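I need to scrape a site, but it shows "Checking your browser before accessing" and blocks access to the site.

Do I have to define cookies, or is there another solution?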

from selenium import webdriver
from time import sleep

options = webdriver.ChromeOptions()
options.add_argument("--no-sandbox")
options.add_argument("--window-size=1920,1080")
options.add_argument("user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:95.0) Gecko/20100101 Firefox/95.0")
# options= replaces the deprecated chrome_options= keyword
mainbrowser = webdriver.Chrome(options=options)

mainbrowser.get('https://trade.kraken.com/charts/KRAKEN:BTC-USDT')
sleep(20)

I recently used the following options to avoid captcha detection on some sites:

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
# Chrome profile data (moved from ~/Library/Application Support/Google/Chrome)
options.add_argument("--user-data-dir=chrome-data")
# Drop the enable-automation switch and the automation extension
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
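
To see whether options like these change anything, it can help to ask the browser what a page would read from it. The following is only a rough sketch: the helper names (FINGERPRINT_PROBES, read_fingerprint, looks_automated) are made up for illustration, the list of properties is my assumption about what detection scripts look at, and the heuristic is far from complete. It relies only on driver.execute_script, which Selenium drivers provide.

```python
# JavaScript snippets for a few properties a page can read.
# Which properties a given site actually checks is an assumption.
FINGERPRINT_PROBES = {
    "webdriver": "return navigator.webdriver",
    "user_agent": "return navigator.userAgent",
    "languages": "return navigator.languages",
    "platform": "return navigator.platform",
    "vendor": "return navigator.vendor",
}


def read_fingerprint(driver):
    """Run each probe through driver.execute_script and collect the results."""
    return {name: driver.execute_script(js) for name, js in FINGERPRINT_PROBES.items()}


def looks_automated(fingerprint):
    """Crude heuristic: navigator.webdriver is truthy, or the UA says HeadlessChrome."""
    user_agent = fingerprint.get("user_agent") or ""
    return bool(fingerprint.get("webdriver")) or "HeadlessChrome" in user_agent
```

With a live session you would call read_fingerprint(driver) after driver.get(...) and print the result. A value that looks clean here does not guarantee the site will let you through, since sites can check many other signals.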

In addition, I used the selenium-stealth library (https://pypi.org/project/selenium-stealth/), which bundles many detection-avoidance techniques into one package:

from selenium_stealth import stealth

driver = webdriver.Chrome(options=options)

stealth(
    driver,
    user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.53 Safari/537.36",
    languages=["en-US", "en"],
    vendor="Google Inc.",
    platform="Win32",
    webgl_vendor="Intel Inc.",
    renderer="Intel Iris OpenGL Engine",
    fix_hairline=True,
    run_on_insecure_origins=True,
)
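
After loading the page, you could also poll for the browser check to go away instead of using a fixed sleep(20) as in the question. This is a minimal sketch, not a tested recipe for this site: the function name wait_for_challenge_to_clear is hypothetical, and the marker string is assumed from the message quoted in the question, so the real page may word it differently. It only reads driver.page_source.

```python
import time

# Assumed marker, based on the message quoted in the question.
CHALLENGE_MARKER = "Checking your browser"


def wait_for_challenge_to_clear(driver, timeout=30.0, poll=1.0, marker=CHALLENGE_MARKER):
    """Poll driver.page_source until `marker` is no longer present.

    Returns True once the marker is gone, or False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if marker.lower() not in driver.page_source.lower():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Usage would be something like calling driver.get(url) and then checking wait_for_challenge_to_clear(driver) before scraping. If it returns False, the check never cleared and the page content is probably still the interstitial.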