aiohttp ClientSession.get() method failing silently - Python3.7
I'm building a small application that tries to find company website URLs by searching Bing for company names. It takes in a large list of company names, uses the Bing Search API to fetch the first URL for each, and saves those URLs back onto the list.
I'm having trouble with aiohttp's ClientSession.get() method. Specifically, it fails silently and I can't figure out why.
Here is how I initialize the script. Keep an eye out for worker.perform_mission():
async def _execute(workers, *, loop=None):
    if not loop:
        loop = asyncio.get_event_loop()
    [asyncio.ensure_future(i.perform_mission(verbose=True), loop=loop) for i in workers]
def main():
    filepth = 'c:\SOME\FILE\PATH.xlsx'
    cache = pd.read_excel(filepth)

    # CHANGE THE NUMBER IN range(<here>) TO ADD MORE WORKERS.
    workers = (Worker(cache) for i in range(1))
    loop = asyncio.get_event_loop()
    loop.run_until_complete(_execute(workers, loop=loop))
    ...<MORE STUFF>...
The worker.perform_mission() method does the following (scroll to the bottom and look at _split_up_request_like_they_do_in_the_docs()):
class Worker(object):
    def __init__(self, shared_cache):
        ...<MORE STUFF>...

    async def perform_mission(self, verbose=False):
        while not self.mission_complete:
            if not self.company_name:
                await self.find_company_name()
                if verbose:
                    print('Obtained Company Name')
            if self.company_name and not self.website:
                print('Company Name populated but no website found yet.')
                data = await self.call_bing() #<<<<< THIS IS SILENTLY FAILING.
                if self.website and ok_to_set_website(self.shared_cache, self):
                    await self.try_set_results(data)
                    self.mission_complete = True
                else:
                    print('{} worker failed at setting website.'.format(self.company_name))
                    pass
            else:
                print('{} worker failed at obtaining data from Bing.'.format(self.company_name))
                pass

    async def call_bing(self):
        async with aiohttp.ClientSession() as sesh:
            sesh.headers = self.headers
            sesh.params = self.params
            return await self._split_up_request_like_they_do_in_the_docs(sesh)

    async def _split_up_request_like_they_do_in_the_docs(self, session):
        print('_bing_request() successfully called.') #<<<THIS CATCHES
        async with session.get(self.search_url) as resp:
            print('Session.get() successfully called.') #<<<THIS DOES NOT.
            return await resp.json()
And finally, my output is:
Obtained Company Name
Company Name populated but no website found yet.
_bing_request() successfully called.
Process finished with exit code 0
Can anyone help me figure out why print('Session.get() successfully called.') is never reached? ...or help me ask this question better?
Look at this part:
async def _execute(workers, *, loop=None):
    # ...
    [asyncio.ensure_future(i.perform_mission(verbose=True), loop=loop) for i in workers]
You create a bunch of tasks, but you never await their completion. That means _execute itself finishes as soon as the tasks are created, long before those tasks are done. And since you run the event loop only until _execute completes, it stops shortly after it starts.
To fix the issue, use asyncio.gather to wait for multiple awaitables to finish:
async def _execute(workers, *, loop=None):
    # ...
    tasks = [asyncio.ensure_future(i.perform_mission(verbose=True), loop=loop) for i in workers]
    await asyncio.gather(*tasks)
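To see the difference concretely, here is a minimal, self-contained sketch. The `work` coroutine is a hypothetical stand-in for `perform_mission()`; the broken variant abandons its tasks exactly the way your `_execute` does, while the fixed variant gathers them:

```python
import asyncio

async def work(n):
    # Stand-in for perform_mission(): yields control once, then returns.
    await asyncio.sleep(0)
    return n

async def _execute_broken(workers):
    # Tasks are created but never awaited: this coroutine returns
    # immediately, the event loop stops, and the pending tasks are
    # abandoned before they get a chance to run to completion.
    [asyncio.ensure_future(work(i)) for i in workers]

async def _execute_fixed(workers):
    tasks = [asyncio.ensure_future(work(i)) for i in workers]
    # gather() suspends until every task has finished (or raised).
    return await asyncio.gather(*tasks)

results = asyncio.run(_execute_fixed(range(3)))
print(results)  # [0, 1, 2]
```

Since you're on Python 3.7, you can also replace the `get_event_loop()` / `run_until_complete()` boilerplate in `main()` with `asyncio.run(_execute(workers))`, as above.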