PhantomJS returning blank page with HTTPS

Question

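I am using a phantomjs / selenium / beautifulsoup setup to print the page source, but over https it only returns blank html. Over http it returns the page source as expected. I have read a rake of material, such as this and this, but to no avail.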

from selenium import webdriver
import urllib.request as urllib2
import requests
import urllib
from bs4 import BeautifulSoup
import csv
import time

browser = webdriver.PhantomJS(service_args=['--ignore-ssl-errors=true', '--ssl-protocol=any'])
browser.get('https://google.com')
browser.set_window_size(2000, 1500)

soup = BeautifulSoup(browser.page_source, "html.parser")

print(soup)

browser.quit()

Result

<html><head></head><body></body></html>
Complete

Answer 1

# Raw strings (r'...') keep the backslashes literal; in a plain string, \t would be read as a tab.
browser = webdriver.PhantomJS(service_args=['--ignore-ssl-errors=true', r'--ssl-client-certificate-file=C:\tmp\clientcert.cer', r'--ssl-client-key-file=C:\tmp\clientcert.key', '--ssl-client-key-passphrase=1111'])

You have to point the SSL certificate options at local files.
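
If it helps, here is a minimal sketch of building that argument list separately, so the paths can be checked before PhantomJS is started. The helper name build_phantomjs_ssl_args is made up for illustration and is not part of selenium or PhantomJS; the paths and passphrase are just the placeholder values from the line above. It only assembles strings, so it runs without a browser.

```python
def build_phantomjs_ssl_args(cert_file, key_file, passphrase=None):
    """Hypothetical helper: assemble a service_args list for webdriver.PhantomJS."""
    args = [
        "--ignore-ssl-errors=true",
        f"--ssl-client-certificate-file={cert_file}",
        f"--ssl-client-key-file={key_file}",
    ]
    # Only add the passphrase option when the key is actually protected by one.
    if passphrase is not None:
        args.append(f"--ssl-client-key-passphrase={passphrase}")
    return args


# Raw strings keep Windows backslashes literal; in a plain string "\t" would be a tab.
service_args = build_phantomjs_ssl_args(
    r"C:\tmp\clientcert.cer",  # placeholder path from the answer
    r"C:\tmp\clientcert.key",  # placeholder path from the answer
    "1111",                    # placeholder passphrase from the answer
)
print(service_args)
```

The resulting list can then be passed as webdriver.PhantomJS(service_args=service_args), assuming a Selenium version that still ships the PhantomJS driver.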


selenium

web-scraping

phantomjs

selenium-webdriver