Stop the running instances when max_instances is reached
I'm using apscheduler-django and created a task that runs every 10 seconds.
The function makes a request to an API and saves the content to my database (PostgreSQL).
This is my task:
scheduler.add_job(
    SaveAPI,
    trigger=CronTrigger(second="*/10"),
    id="SaveAPI",
    max_instances=1,
    replace_existing=True,
)
My SaveAPI is:
def SaveAPI():
    SPORT = 3
    print('API Queue Started')
    AllMatches = GetAllMatches(SPORT)
    for Match in AllMatches:
        AddToDatabase(Match, SPORT)
    print('API Queue Ended')
GetAllMatches and AddToDatabase are too big, and I think their implementations are irrelevant to the problem (minimal stand-ins are sketched below).
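(For anyone who wants to run the example: hypothetical placeholders like the following are enough. The names match the question, but the bodies are invented for illustration only.)

# Invented placeholders so SaveAPI runs end to end; the real versions call a
# sports API and write the results to PostgreSQL.
def GetAllMatches(sport):
    return [{'id': 1, 'sport': sport}]  # real version: fetch matches from the API

def AddToDatabase(match, sport):
    print(f"saving match {match['id']} for sport {sport}")  # real version: ORM save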
My problem is that sometimes I get this error:
Run time of job "SaveAPI (trigger: cron[second='*/10'], next run at: 2022-03-05 23:21:00 +0330)" was missed by 0:00:11.445357
When this happens, the run is not replaced by a new instance, because my SaveAPI function has not finished, and apscheduler keeps missing the new instances.
I have tested a lot, and there is nothing wrong with the function itself.
How can I make apscheduler stop the last running instance whenever a new one would otherwise be missed?
In other words, if the last instance takes longer than 10 seconds, I just want to kill it and create a new one.
Neither apscheduler nor apscheduler-django supports this directly.
You can implement and use a custom executor that keeps track of the process running each job and kills that process when a job is submitted while a previous run is still in progress.
Here is a MaxInstancesCancelEarliestProcessPoolExecutor that uses pebble.ProcessPool; unlike the standard library pools, pebble can cancel a task that is already running by terminating its worker process. The lines marked # + are the changes relative to APScheduler's stock pool executor code.
import threading
from collections import defaultdict
from concurrent.futures import CancelledError
from concurrent.futures.process import BrokenProcessPool

from apscheduler.executors.base import MaxInstancesReachedError, run_job
from apscheduler.executors.pool import BasePoolExecutor
from pebble import ProcessPool


class MaxInstancesCancelEarliestProcessPoolExecutor(BasePoolExecutor):
    def __init__(self):
        pool = ProcessPool()
        # Adapt pebble's schedule() to the submit() interface BasePoolExecutor expects.
        pool.submit = lambda function, *args: pool.schedule(function, args=args)
        super().__init__(pool)
        self._futures = defaultdict(list)  # job.id -> pending futures, oldest first

    def submit_job(self, job, run_times):
        assert self._lock is not None, 'This executor has not been started yet'
        with self._lock:
            if self._instances[job.id] >= job.max_instances:
                # Cancel the earliest running instance; pebble kills its process.
                f = self._futures[job.id][0]                  # +
                f.cancel()                                    # +
                try:                                          # +
                    # Make the pool process the cancellation right away so the
                    # done-callback below runs and frees the instance slot.
                    self._pool._pool_manager.update_status()  # +
                except RuntimeError:                          # +
                    pass                                      # +
                if self._instances[job.id] >= job.max_instances:  # +
                    raise MaxInstancesReachedError(job)

            self._do_submit_job(job, run_times)
            self._instances[job.id] += 1

    def _do_submit_job(self, job, run_times):
        def callback(f):
            with self._lock:                      # +
                self._futures[job.id].remove(f)   # +
            try:                                  # +
                exc, tb = (f.exception_info() if hasattr(f, 'exception_info') else
                           (f.exception(), getattr(f.exception(), '__traceback__', None)))
            except CancelledError:                # +
                # A cancelled (killed) run is reported as a timeout.
                exc, tb = TimeoutError(), None    # +
            if exc:
                self._run_job_error(job.id, exc, tb)
            else:
                self._run_job_success(job.id, f.result())

        try:
            f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name)
        except BrokenProcessPool:
            self._logger.warning('Process pool is broken; replacing pool with a fresh instance')
            self._pool = self._pool.__class__(self._pool._max_workers)
            f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name)

        f.add_done_callback(callback)
        self._futures[job.id].append(f)           # +

    def shutdown(self, wait=True):
        if wait:
            self._pool.close()
            self._pool.join()
        else:
            self._pool.close()
            threading.Thread(target=self._pool.join).start()
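How it works: when submit_job sees that max_instances has been reached, it cancels the earliest tracked future, and because the jobs run in a pebble ProcessPool, cancelling a running future terminates its worker process. The update_status() call nudges the pool to handle that cancellation immediately, so the done-callback fires, the killed run is reported as a TimeoutError, and the instance slot is freed before the count is rechecked; if the slot is somehow still occupied, the usual MaxInstancesReachedError is raised. Note that _pool_manager and _max_workers are pebble/APScheduler internals, so this relies on implementation details that may change between versions.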
Usage:
scheduler.add_executor(MaxInstancesCancelEarliestProcessPoolExecutor(),
                       alias='max_instances_cancel_earliest')

scheduler.add_job(
    SaveAPI,
    trigger=CronTrigger(second="*/10"),
    id="SaveAPI",
    max_instances=1,
    executor='max_instances_cancel_earliest',  # +
    replace_existing=True,
)
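For completeness, a standalone sketch that exercises the executor outside Django (this is my own demo, assuming the executor class above is importable; slow_job and the 60-second observation window are invented for illustration):

import time
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.cron import CronTrigger

def slow_job():
    print('run started')
    time.sleep(15)          # deliberately longer than the 10-second interval
    print('run finished')   # not reached once the next run cancels this one

if __name__ == '__main__':
    scheduler = BackgroundScheduler()
    scheduler.add_executor(MaxInstancesCancelEarliestProcessPoolExecutor(),
                           alias='max_instances_cancel_earliest')
    scheduler.add_job(slow_job, trigger=CronTrigger(second='*/10'),
                      id='slow_job', max_instances=1,
                      executor='max_instances_cancel_earliest')
    scheduler.start()
    time.sleep(60)          # every ~10 s the previous run is killed and replaced
    scheduler.shutdown()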