Python - Running many different types of threads without exhausting resources

I am trying to do some clean multi-threading, but maybe my approach is wrong.

I have a BaseFix thread class with 6 different types of children; one of them is a CommandFix thread, another is a ParamFix thread.

Their run functions contain different code, but they mostly rely on the subprocess.run function.
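BaseFix itself isn't shown here; purely as a reference point, a minimal sketch of what it might look like follows, assuming it subclasses threading.Thread and stores the log handles and rule metadata that the children use (the attribute names are assumptions taken from the CommandFix code below):

import threading

class BaseFix(threading.Thread):
    # hypothetical sketch -- the real BaseFix is not shown in the question
    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs=None, verbose=None):
        super(BaseFix, self).__init__(group=group, target=target,
                                      name=name, args=args, kwargs=kwargs)
        self.verbose = verbose
        # attributes the children rely on (assumed, set elsewhere)
        self.rule_id = None
        self.command = None
        self.stdout_log = None
        self.stderr_log = None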

import subprocess


class CommandFix(BaseFix):

    def __init__(self, group=None, target=None, name=None,
                 args=(), kwargs=None, verbose=None):

        # pass all of the thread arguments through to BaseFix
        super(CommandFix, self).__init__(group=group, target=target,
                                         name=name, args=args,
                                         kwargs=kwargs, verbose=verbose)


    def _execute_command(self):
        self.stdout_log.write("fixing {} : running command {}".format(
            self.rule_id,
            self.command))

        try:
            cp = subprocess.run(self.command,
                shell=True, check=True,
                stdout=self.stdout_log,
                stderr=self.stderr_log
                )
        except subprocess.CalledProcessError as e:
            self.stderr_log.write("error fixing {} : {}".format(
                self.rule_id,
                e))

            # TO-DO: Handle error with retry or similar
            # right now just return false
            return False

        return True

    def _verify_command(self):
        pass

    def run(self):

        # execute the command
        if self._execute_command():
            # now run the verification
            pass
        else:
            # TO-DO: error handling
            pass

I have around 200 fixes of roughly 6 different types. I check each one's fix type, then instantiate the fix that is needed and call run like this:

def _fix(key, rule):
    expr = "./{}fixtext".format(NS)
    fixtext = rule.find(expr)

    expr = "./{}div".format(NS1)
    fixtext = fixtext.find(expr).getchildren()[0].getchildren()[0].text

    # str.find returns -1 when the phrase is missing, so test against -1
    if fixtext.find('Run the following command') != -1:
        fix = CommandFix(rule, stdout_log, stderr_log, key)

    elif fixtext.find('Set the following') != -1:
        # fix = SomeOtherFix()
        return  # not implemented yet

    elif fixtext.find('For 32 bit systems') != -1:
        # fix = SomeOtherKindOfFix()
        return  # not implemented yet

    else:
        # unknown fix type, nothing to run
        return

    fix.run()

# there are only 200 but assume there are 2 million
for key, rule in rules.items():
    _fix(key, rule)

What I would like to do is clean this up so that I only execute X number of fixes at a time, so that I don't exhaust system resources.

As you already say in the tags of your question: use a pool. As long as your *Fix() calls use subprocess.run(), they spawn new processes and so achieve real parallelism; a thread pool is therefore sufficient:

from multiprocessing.dummy import Pool    # wrapper around the threading module
#from multiprocessing import Pool         # real processes instead of threads
[...]
pool = Pool(5)                            # number of fixes at once
pool.starmap(_fix, rules.items())
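The same bounded behaviour can also be expressed with the standard-library executor API; this is just a sketch under the assumption that _fix keeps its (key, rule) signature:

from concurrent.futures import ThreadPoolExecutor

# at most 5 fixes run at once; the remaining rules wait for a free worker
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(_fix, key, rule)
               for key, rule in rules.items()]
    for future in futures:
        future.result()    # re-raises any exception raised inside _fix

If the fixes ever become CPU-bound Python work instead of subprocess calls, the drop-in replacement is ProcessPoolExecutor (or the real multiprocessing.Pool mentioned in the commented-out import above).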