Pythonic use of iterations and conditionals

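This is a somewhat general question. I'm looking for the most Pythonic and/or most efficient way to do the following:

I have a large dataset and a number of tasks. Depending on certain conditions, a task sometimes has to be carried out by iterating over the rows and sometimes does not have to run at all.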

for step in np.arange(0, number_of_steps):
    if condition1:
        do_calculation1(step)
    if condition2:
        do_calculation2(step)

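So the if statements are evaluated again on every iteration. Each condition is either true or false for the whole dataset, so to save time I skip the iteration entirely when it is not needed:

if condition1 or condition2:
    for step in np.arange(0, number_of_steps):
        if condition1:
            do_calculation1(step)
        if condition2:
            do_calculation2(step)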

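But sometimes I am still repeating the if statements unnecessarily. Another option is to separate the conditionals and make two passes over the dataset: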

if condition1:
    for step in np.arange(0, number_of_steps):
        do_calculation1(step)
if condition2:
    for step in np.arange(0, number_of_steps):
        do_calculation2(step)

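The downside is that when both conditions are true I iterate twice, which is slow (and clumsy). The relative speed of the two approaches depends on how often each condition holds, but I will be working with a wide variety of data, so I don't know in advance which one is faster.

So my question is: which approach is the most Pythonic and the most efficient?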

You can simply combine the two approaches:

if not condition1 and not condition2:
    pass  # nothing to do, so no iteration at all
elif not condition1 and condition2:
    # only condition2 holds
    for step in np.arange(0, number_of_steps):
        do_calculation2(step)
elif condition1 and not condition2:
    # only condition1 holds
    for step in np.arange(0, number_of_steps):
        do_calculation1(step)
else:  # condition1 and condition2
    for step in np.arange(0, number_of_steps):
        do_calculation1(step)
        do_calculation2(step)

I guess this is more a question of efficiency than of what is more Pythonic.
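Since the efficiency depends on the workload, one way to decide is to measure it. Below is a minimal sketch using the standard-library timeit module. The functions calc1 and calc2 are hypothetical stand-ins for do_calculation1 and do_calculation2, and the step count is arbitrary, so the numbers it prints only indicate how to compare the variants, not what you will see on real data.

```python
import timeit


def calc1(step):
    # Hypothetical stand-in for do_calculation1.
    return step + 1


def calc2(step):
    # Hypothetical stand-in for do_calculation2.
    return step * 2


def checks_inside(n, c1, c2):
    """Test both flags on every iteration."""
    total = 0
    for step in range(n):
        if c1:
            total += calc1(step)
        if c2:
            total += calc2(step)
    return total


def checks_hoisted(n, c1, c2):
    """Test the flags once, then run a loop without any if statements."""
    total = 0
    if c1 and c2:
        for step in range(n):
            total += calc1(step)
            total += calc2(step)
    elif c1:
        for step in range(n):
            total += calc1(step)
    elif c2:
        for step in range(n):
            total += calc2(step)
    return total


if __name__ == "__main__":
    n = 20_000
    for c1, c2 in [(True, True), (True, False), (False, False)]:
        inside = timeit.timeit(lambda: checks_inside(n, c1, c2), number=5)
        hoisted = timeit.timeit(lambda: checks_hoisted(n, c1, c2), number=5)
        print(f"c1={c1!s:5} c2={c2!s:5} inside={inside:.4f}s hoisted={hoisted:.4f}s")
```

If the calculations themselves are expensive, the cost of the two boolean checks per step is likely to be small in comparison, which is worth confirming with your own functions before restructuring the code.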

I think this would be more Pythonic:

def run_calcs(number_of_steps, *funcs):
    for step in range(number_of_steps):
        for func in funcs:
            func(step)

def gen_func_list(condition1=False, condition2=False):
    func_list = []
    if condition1:
        func_list.append(do_calculation1)
    if condition2:
        func_list.append(do_calculation2)
    return func_list

if __name__ == '__main__':

    number_of_steps = 10

    run_calcs(
        number_of_steps,
        *gen_func_list(
            condition1=<your condition here>,
            condition2=<your condition here>
        )
    )
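To try the idea end to end, here is a small runnable sketch of the same pattern. The two do_calculation functions are hypothetical stubs that only record which step they were called with, and the condition values are made up for the example.

```python
results = []


def do_calculation1(step):
    # Hypothetical stub: record the call instead of doing real work.
    results.append(("calc1", step))


def do_calculation2(step):
    # Hypothetical stub: record the call instead of doing real work.
    results.append(("calc2", step))


def run_calcs(number_of_steps, *funcs):
    # Call every selected function once per step.
    for step in range(number_of_steps):
        for func in funcs:
            func(step)


def gen_func_list(condition1=False, condition2=False):
    # Decide once, up front, which calculations are needed.
    func_list = []
    if condition1:
        func_list.append(do_calculation1)
    if condition2:
        func_list.append(do_calculation2)
    return func_list


run_calcs(3, *gen_func_list(condition1=True, condition2=False))
print(results)  # [('calc1', 0), ('calc1', 1), ('calc1', 2)]
```

Note that when no condition holds, run_calcs still walks through all the steps with an empty inner loop. A guard such as `if not funcs: return` at the top of run_calcs, which is not part of the answer above, would skip the iteration entirely in that case.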

I think this is also quite readable, and it lends itself to multiprocessing:

from multiprocessing import Process

def run_calcs(number_of_steps, *funcs):
    for step in range(number_of_steps):
        for func in funcs:
            func(step)

def gen_func_list(condition1=True, condition2=True):
    func_list = []
    if condition1:
        func_list.append(do_calculation1)
    if condition2:
        func_list.append(do_calculation2)
    return func_list

if __name__ == '__main__':

    number_of_steps = 10

    funcs = gen_func_list(
            condition1=<your condition here>,
            condition2=<your condition here>
    )

    proc_handles = []
    for f in funcs:
        proc_handles.append(
            Process(target=run_calcs,
                    args=[number_of_steps, f])
        )

    for p in proc_handles:
        p.start()

    for p in proc_handles:
        p.join()

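I think the first one is the more Pythonic approach. But if you really want to skip the iteration when neither condition is met, you can add a break statement:

for ... :
    # any() takes a single iterable, so wrap the conditions in a list
    if not any([condition1, condition2]):
        break
    else:
        if condition1:
            ...
        if condition2:
            ...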

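This lets you leave the loop at the very first step when neither condition holds; otherwise both conditions are checked as before.

(Sorry about the formatting, I'm typing this on a phone.)

Here is what I would do: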
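One detail worth spelling out: the built-in any takes a single iterable, not separate arguments, so the conditions have to be wrapped in a list or tuple. A short sketch with made-up flag values:

```python
condition1 = False
condition2 = False

# Passing the flags as two positional arguments raises TypeError.
try:
    any(condition1, condition2)
except TypeError as exc:
    error = type(exc).__name__

# Wrapping them in a list works, and is equivalent to a plain "or".
skip = not any([condition1, condition2])
same = not (condition1 or condition2)

print(error, skip, same)  # TypeError True True
```

With only two flags, `not (condition1 or condition2)` is arguably the simpler spelling; any starts to pay off when the conditions already live in a list.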

calculations = [
    f for c,f in [
        (condition1, do_calculation1),
        (condition2, do_calculation2),
    ] if c
]
if calculations:
    for step in np.arange(0, number_of_steps):
        for calc in calculations:
            calc(step)
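The same filtering can also be spelled with itertools.compress from the standard library, which keeps the items whose matching selector is truthy. This is an alternative sketch, not what the answer above uses; calc_a and calc_b are hypothetical stand-ins for the real calculations, and the flag values are made up.

```python
from itertools import compress

log = []


def calc_a(step):
    # Hypothetical stand-in for do_calculation1.
    log.append(f"a{step}")


def calc_b(step):
    # Hypothetical stand-in for do_calculation2.
    log.append(f"b{step}")


condition1, condition2 = False, True

# Keep only the functions whose condition is true.
calculations = list(compress([calc_a, calc_b], [condition1, condition2]))

# An empty list is falsy, so the loop is skipped when nothing is selected.
if calculations:
    for step in range(2):
        for calc in calculations:
            calc(step)

print(log)  # ['b0', 'b1']
```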

I think the first approach is the best one, because your dataset either meets the conditions (1 and/or 2) or it does not.

for step in np.arange(0, number_of_steps):
    if condition1:
        do_calculation1(step)
    if condition2:
        do_calculation2(step)

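If the two conditions are mutually exclusive, you can replace the second if with elif. That saves evaluating the second condition whenever the first one is true.

for step in np.arange(0, number_of_steps):
    if condition1:
        do_calculation1(step)
    elif condition2:
        do_calculation2(step)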
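The elif version only matches the two separate if statements when the conditions really cannot both be true. The sketch below makes the difference visible; the function names and the recorded tuples are invented for the illustration and stand in for the real calculations.

```python
def run_independent(condition1, condition2, steps=2):
    # Two separate if statements: both branches can run in the same step.
    calls = []
    for step in range(steps):
        if condition1:
            calls.append(("calc1", step))
        if condition2:
            calls.append(("calc2", step))
    return calls


def run_exclusive(condition1, condition2, steps=2):
    # if/elif: the second branch is only considered when the first is false.
    calls = []
    for step in range(steps):
        if condition1:
            calls.append(("calc1", step))
        elif condition2:
            calls.append(("calc2", step))
    return calls


print(run_independent(True, True))
# [('calc1', 0), ('calc2', 0), ('calc1', 1), ('calc2', 1)]
print(run_exclusive(True, True))
# [('calc1', 0), ('calc1', 1)]
```

So if both conditions can hold for the same dataset, as the question suggests, switching to elif would silently drop the second calculation.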