Dryscrape session can't load any site
I installed dryscrape on pythonanywhere.com, but the session can't load any site. Why?
import dryscrape
# as in demo: http://dryscrape.readthedocs.io/en/latest/usage.html#first-demonstration
dryscrape.start_xvfb()
sess = dryscrape.Session()
sess.visit('https://www.pythonanywhere.com/')
The result is an error:
sess.visit('https://www.pythonanywhere.com/')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/igorsavinkin/.local/lib/python3.5/site-packages/dryscrape/session.py", line 33, in visit
return self.driver.visit(self.complete_url(url))
File "/home/igorsavinkin/.local/lib/python3.5/site-packages/webkit_server.py", line 235, in visit
self.conn.issue_command("Visit", url)
File "/home/igorsavinkin/.local/lib/python3.5/site-packages/webkit_server.py", line 520, in issue_command
return self._read_response()
File "/home/igorsavinkin/.local/lib/python3.5/site-packages/webkit_server.py", line 530, in _read_response
raise InvalidResponseError(msg)
webkit_server.InvalidResponseError: {"class":"InvalidResponseError","message":"Unable to load URL: https://www.pythonanywhere.com/ because of error loading https://www.pythonanywhere.com/: Unknown error"}
The problem is the same no matter which whitelisted site I visit with the session.
I have read dryscrape's installation prerequisites:
Before installing dryscrape, you need to install some software it depends on:
- Qt, QtWebKit
- lxml
- pip
- xvfb_ (necessary only if no other X server is available)
So neither Qt nor QtWebKit is among PythonAnywhere's default modules...
When I tried to install Qt, it failed with an error (the same happens for QtWebKit):
$ pip install --user Qt
Collecting Qt
Could not find a version that satisfies the requirement Qt (from versions: )
No matching distribution found for Qt
The dryscrape install file, setup.py:
from distutils.core import setup, Command
setup(name='dryscrape',
version='0.9.1',
description='a lightweight Javascript-aware, headless web scraping library for Python',
author='Niklas Baumstark',
author_email='niklas.baumstark@gmail.com',
license='MIT',
url='https://niklasb.github.com/dryscrape',
packages=['dryscrape', 'dryscrape.driver'],
requires=['webkit_server', 'lxml'],
)
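Any help is appreciated...
PythonAnywhere dev here. Unfortunately, dryscrape depends on WebKit, and WebKit doesn't work on our virtualisation system. If you need to scrape with a browser that can handle JavaScript, you can use selenium with Firefox; there is more information on our blog. Note, though, that we only have Firefox version 17, because more recent versions have the same problem as WebKit.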
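As a rough illustration of the selenium and Firefox route, here is a minimal sketch. It assumes the third-party packages selenium and pyvirtualdisplay are installed and that a Firefox build compatible with the installed selenium version is available; an old Firefox generally needs a correspondingly old selenium release. The helper names headless_firefox, page_title and main are made up for this example and are not part of any library.

```python
from contextlib import contextmanager


@contextmanager
def headless_firefox():
    """Yield a selenium Firefox driver running inside a virtual X display."""
    # Imported lazily so page_title() below can be used without these packages.
    from pyvirtualdisplay import Display  # assumption: pyvirtualdisplay is installed
    from selenium import webdriver        # assumption: selenium is installed

    with Display():  # starts a virtual display (Xvfb by default), stops it on exit
        browser = webdriver.Firefox()
        try:
            yield browser
        finally:
            # Always shut the browser down, even if the page load fails.
            browser.quit()


def page_title(browser, url):
    """Load url in the given browser and return the page title."""
    browser.get(url)
    return browser.title


def main():
    # Hypothetical usage: print the title of the page from the question.
    with headless_firefox() as browser:
        print(page_title(browser, "https://www.pythonanywhere.com/"))
```

Calling main() from a console or a scheduled task would run the whole flow. The virtual display plays the same role that dryscrape.start_xvfb() plays in the question, and the try/finally keeps stray Firefox processes from piling up when a load fails.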