How do I make a dryscrape session?
I'm trying to create a dryscrape session on a Mac. The code I'm trying to run is:
import dryscrape
session = dryscrape.Session(base_url = 'http://google.com')
But when I run it, I get this permission error:
Traceback (most recent call last):
  File "<ipython-input-37-5e3204f25ebb>", line 3, in <module>
    session = dryscrape.Session(base_url = 'http://google.com')
  File "/Users/MyName/anaconda/lib/python3.5/site-packages/dryscrape/session.py", line 22, in __init__
    self.driver = driver or DefaultDriver()
  File "/Users/MyName/anaconda/lib/python3.5/site-packages/dryscrape/driver/webkit.py", line 30, in __init__
    super(Driver, self).__init__(**kw)
  File "/Users/MyName/anaconda/lib/python3.5/site-packages/webkit_server.py", line 230, in __init__
    self.conn = connection or ServerConnection()
  File "/Users/MyName/anaconda/lib/python3.5/site-packages/webkit_server.py", line 507, in __init__
    self._sock = (server or get_default_server()).connect()
  File "/Users/MyName/anaconda/lib/python3.5/site-packages/webkit_server.py", line 450, in get_default_server
    _default_server = Server()
  File "/Users/MyName/anaconda/lib/python3.5/site-packages/webkit_server.py", line 416, in __init__
    stderr = subprocess.PIPE)
  File "/Users/MyName/anaconda/lib/python3.5/subprocess.py", line 947, in __init__
    restore_signals, start_new_session)
  File "/Users/MyName/anaconda/lib/python3.5/subprocess.py", line 1551, in _execute_child
    raise child_exception_type(errno_num, err_msg)
PermissionError: [Errno 13] Permission denied
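I've already tried running it with sudo in the terminal, but I still get the same error. Thanks for your help! Note: I'll upvote every answer and accept the best one.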
I got this working:
# scrape.py
import dryscrape
s = dryscrape.Session()
s.visit("https://www.google.com/search?q={}".format('query'))
print(s.body().encode("utf-8"))
That should print the HTML.
I run it like this:
python scrape.py > results.html
Then I open results.html in a browser to view it.
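If you'd rather not depend on shell redirection, the page body can be written to a file directly. Note that under Python 3, printing encoded bytes emits a b'...' literal rather than raw HTML, so writing the text with an explicit encoding is usually safer. This is only a minimal sketch: save_html is a hypothetical helper name (not part of dryscrape), and it assumes html is a text string such as the value returned by s.body().

```python
import io
import os
import tempfile


def save_html(html, path):
    """Write a page body to `path` as UTF-8 text and return the path.

    `html` is assumed to be a text string, e.g. the result of s.body().
    """
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(html)
    return path


if __name__ == "__main__":
    # Stand-in for s.body(); a real run would pass the session's body here.
    sample = u"<html><body>caf\u00e9</body></html>"
    target = os.path.join(tempfile.gettempdir(), "results.html")
    save_html(sample, target)
    print("Wrote", target)
```

In the script above you would call save_html(s.body(), "results.html") in place of the print line, then open the file in a browser as before.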
Here is a very basic example from the documentation.
import dryscrape
import sys
if 'linux' in sys.platform:
    # start xvfb in case no X is running. Make sure xvfb
    # is installed, otherwise this won't work!
    dryscrape.start_xvfb()
search_term = 'dryscrape'
# set up a web scraping session
sess = dryscrape.Session(base_url = 'http://google.com')
# we don't need images
sess.set_attribute('auto_load_images', False)
# visit homepage and search for a term
sess.visit('/')
q = sess.at_xpath('//*[@name="q"]')
q.set(search_term)
q.form().submit()
# extract all links
for link in sess.xpath('//a[@href]'):
    print(link['href'])
# save a screenshot of the web page
sess.render('google.png')
print("Screenshot written to 'google.png'")
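On the PermissionError itself: the traceback in the question ends inside subprocess while webkit_server.py is launching its server process, so one plausible cause is that the webkit_server binary is missing its execute bit. That would also explain why sudo didn't help, since root still needs at least one execute bit set on a file to run it. The sketch below is just a way to check that guess. check_executable and make_executable are hypothetical helper names, and the binary's location (next to webkit_server.py in site-packages) is an assumption based on the paths in the traceback.

```python
import os
import stat


def check_executable(path):
    """Describe whether `path` exists and can be executed by this user."""
    if not os.path.exists(path):
        return "missing"
    if os.access(path, os.X_OK):
        return "executable"
    return "not executable"


def make_executable(path):
    """Add execute permission for user, group, and others (like chmod +x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


if __name__ == "__main__":
    # Assumption: the helper binary sits next to webkit_server.py.
    # Adjust the path if your installation differs.
    try:
        import webkit_server
    except ImportError:
        print("webkit_server is not installed in this environment")
    else:
        binary = os.path.join(
            os.path.dirname(webkit_server.__file__), "webkit_server"
        )
        print(binary, "->", check_executable(binary))
```

If it reports "not executable", calling make_executable(binary) (or running chmod +x on that path) and retrying dryscrape.Session() would be the next thing to try. If it reports "missing", reinstalling webkit-server is probably the better route.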