Python: using the Beautiful Soup library with aiohttp
Does anyone know how to do this:
import html5lib
import urllib.request
from bs4 import BeautifulSoup
soup = BeautifulSoup(urllib.request.urlopen('http://someWebSite.com').read().decode('utf-8'), 'html5lib')
using aiohttp instead of urllib?
Thanks ^^
You can do it like this:
import asyncio
import aiohttp
import html5lib
from bs4 import BeautifulSoup

SELECTED_URL = 'http://someWebSite.com'

async def get_site_content():
    # Both context managers release the connection and the session on exit.
    async with aiohttp.ClientSession() as session:
        async with session.get(SELECTED_URL) as resp:
            text = await resp.read()
    # Decode the raw bytes and parse them with the html5lib parser.
    return BeautifulSoup(text.decode('utf-8'), 'html5lib')

# asyncio.run() creates the event loop, runs the coroutine, and closes the loop.
sites_soup = asyncio.run(get_site_content())
print(sites_soup)
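The main payoff of switching from urllib to aiohttp is that several pages can be downloaded concurrently. Below is a minimal sketch of that idea, not part of the original answer: the helper names `fetch_soup` and `fetch_all` and the `parser` parameter are made up for illustration, and it assumes aiohttp, beautifulsoup4, and html5lib are installed.

```python
import asyncio

import aiohttp
from bs4 import BeautifulSoup


async def fetch_soup(session, url, parser="html5lib"):
    """Download one page and parse it (hypothetical helper)."""
    async with session.get(url) as resp:
        resp.raise_for_status()   # fail early on 4xx/5xx responses
        html = await resp.text()  # decodes using the response's charset
    return BeautifulSoup(html, parser)


async def fetch_all(urls, parser="html5lib"):
    """Fetch all URLs concurrently through one shared session (hypothetical helper)."""
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_soup(session, url, parser) for url in urls]
        # gather() returns the results in the same order as the input URLs.
        return await asyncio.gather(*tasks)
```

Calling `asyncio.run(fetch_all([url_a, url_b]))` would return one soup per URL, in input order. Sharing a single `ClientSession` lets aiohttp reuse connections instead of opening a new pool for every request.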
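For anyone looking for further answers:
There is another way to run synchronous code from within the event loop: loop.run_in_executor. It hands the blocking call to an executor (a thread pool by default), so the loop itself is not blocked.
Documentation: https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor
Example code: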
import asyncio
import time

def blocking_func():
    # Stand-in for any synchronous call that would otherwise block the loop.
    time.sleep(5)
    return 42

async def main():
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    result = await loop.run_in_executor(None, blocking_func)
    return result

loop_result = asyncio.run(main())
print(loop_result)  # => 42
This way you can await your task just as you would a coroutine.
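Because run_in_executor returns an awaitable, several blocking calls can also be started together and awaited as a group. The following is a rough sketch under that assumption; the `delay` and `value` parameters and the three sample calls are invented for illustration.

```python
import asyncio
import time


def blocking_func(delay, value):
    # Stand-in for a synchronous call such as a urllib request.
    time.sleep(delay)
    return value


async def main():
    loop = asyncio.get_running_loop()
    # Extra positional arguments after the function are passed through to it.
    futures = [
        loop.run_in_executor(None, blocking_func, 0.2, n)
        for n in (1, 2, 3)
    ]
    # The three sleeps overlap in the default thread pool.
    return await asyncio.gather(*futures)
```

Running `asyncio.run(main())` yields the values in submission order. Keyword arguments are not forwarded by run_in_executor, so wrap the call in `functools.partial` if the function needs them.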