Pythonic use of iterations and conditionals
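This is a slightly general question: I'm looking for the most pythonic and/or most efficient way to do this.
I have a large dataset and a number of tasks that sometimes need to be performed by iterating over the rows and sometimes don't, depending on certain conditions.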
    for step in np.arange(0, number_of_steps):
        if condition1:
            do_calculation1(step)
        if condition2:
            do_calculation2(step)
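So the if statements are repeated on every iteration. The conditions are either true or false for the whole dataset, so to save time I skip the iteration entirely when it isn't needed: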
    if condition1 or condition2:
        for step in np.arange(0, number_of_steps):
            if condition1:
                do_calculation1(step)
            if condition2:
                do_calculation2(step)
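But sometimes I am still repeating the if statements unnecessarily.
Another approach is to separate the conditionals and iterate over the dataset twice: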
    if condition1:
        for step in np.arange(0, number_of_steps):
            do_calculation1(step)
    if condition2:
        for step in np.arange(0, number_of_steps):
            do_calculation2(step)
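The drawback is that if both conditions are true I iterate twice, which is slow (and clunky). The relative speed of the two approaches depends on how often each condition holds, but I will be working with a wide variety of data, so I don't know which will be faster.
So my question is: which approach is the most pythonic and the most efficient?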
You can simply combine the two approaches:
    if not condition1 and not condition2:
        pass
    elif not condition1 and condition2:
        # Only condition2 holds, so only the second calculation runs.
        for step in np.arange(0, number_of_steps):
            do_calculation2(step)
    elif condition1 and not condition2:
        # Only condition1 holds, so only the first calculation runs.
        for step in np.arange(0, number_of_steps):
            do_calculation1(step)
    else:  # condition1 and condition2
        for step in np.arange(0, number_of_steps):
            do_calculation1(step)
            do_calculation2(step)
I guess this is more a question of efficiency than of what is more pythonic.
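Since the relative speed depends on the data and on how heavy the calculations are, measuring is more reliable than guessing. The following is a minimal sketch using the standard `timeit` module. The two `do_calculation` functions are hypothetical stand-ins for the real per-row work, and `checks_inside_loop` and `checks_outside_loop` are illustrative names, not anything from the question.

```python
import timeit


def do_calculation1(step):
    # Hypothetical stand-in for the real per-row work.
    return step * 2


def do_calculation2(step):
    # Hypothetical stand-in for the real per-row work.
    return step + 1


def checks_inside_loop(n, condition1, condition2):
    """Single loop; both flags are re-tested on every iteration."""
    total = 0
    for step in range(n):
        if condition1:
            total += do_calculation1(step)
        if condition2:
            total += do_calculation2(step)
    return total


def checks_outside_loop(n, condition1, condition2):
    """Select the work once, then loop without per-step flag tests."""
    candidates = ((condition1, do_calculation1), (condition2, do_calculation2))
    calcs = [func for flag, func in candidates if flag]
    total = 0
    if calcs:
        for step in range(n):
            for calc in calcs:
                total += calc(step)
    return total


if __name__ == "__main__":
    n = 10_000
    for flags in [(True, True), (True, False), (False, False)]:
        inside = timeit.timeit(lambda: checks_inside_loop(n, *flags), number=20)
        outside = timeit.timeit(lambda: checks_outside_loop(n, *flags), number=20)
        print(f"{flags}: inside={inside:.4f}s outside={outside:.4f}s")
```

With real calculations the cost of testing a boolean is usually small next to the work done per step, so the numbers from a run like this on representative data are what should decide between the variants.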
I think this would be more pythonic:
    def run_calcs(number_of_steps, *funcs):
        # Call every selected function once per step.
        for step in range(number_of_steps):
            for func in funcs:
                func(step)

    def gen_func_list(condition1=False, condition2=False):
        # Build the list of calculations once, before any iteration.
        func_list = []
        if condition1:
            func_list.append(do_calculation1)
        if condition2:
            func_list.append(do_calculation2)
        return func_list

    if __name__ == '__main__':
        number_of_steps = 10
        run_calcs(
            number_of_steps,
            *gen_func_list(
                condition1=condition1,  # your condition here
                condition2=condition2,  # your condition here
            )
        )
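To see the pattern run end to end, here is a small self-contained sketch. The two `do_calculation` functions and the `results` list are hypothetical: they just record what was called so the effect of the flags is visible.

```python
results = []


def do_calculation1(step):
    # Hypothetical calculation: record the square of the step.
    results.append(("calc1", step ** 2))


def do_calculation2(step):
    # Hypothetical calculation: record the step shifted by 10.
    results.append(("calc2", step + 10))


def run_calcs(number_of_steps, *funcs):
    # Call every selected function once per step.
    for step in range(number_of_steps):
        for func in funcs:
            func(step)


def gen_func_list(condition1=False, condition2=False):
    # The conditions are evaluated here, once, not inside the loop.
    func_list = []
    if condition1:
        func_list.append(do_calculation1)
    if condition2:
        func_list.append(do_calculation2)
    return func_list


run_calcs(3, *gen_func_list(condition1=True, condition2=False))
print(results)  # [('calc1', 0), ('calc1', 1), ('calc1', 4)]
```

Note that when both flags are false `run_calcs` still walks the range with an empty `funcs` tuple, so a caller that wants to skip the iteration entirely would need to check for an empty list first.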
I think this is also quite readable, and it lends itself to multiprocessing:
    from multiprocessing import Process

    def run_calcs(number_of_steps, *funcs):
        for step in range(number_of_steps):
            for func in funcs:
                func(step)

    def gen_func_list(condition1=True, condition2=True):
        func_list = []
        if condition1:
            func_list.append(do_calculation1)
        if condition2:
            func_list.append(do_calculation2)
        return func_list

    if __name__ == '__main__':
        number_of_steps = 10
        funcs = gen_func_list(
            condition1=condition1,  # your condition here
            condition2=condition2,  # your condition here
        )
        # One process per selected calculation; each process runs its own loop.
        proc_handles = []
        for f in funcs:
            proc_handles.append(
                Process(target=run_calcs,
                        args=(number_of_steps, f))
            )
        for p in proc_handles:
            p.start()
        for p in proc_handles:
            p.join()
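I think your first version is the more pythonic approach. But if you really want to skip the iteration when neither condition is met, you can add a break statement: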
    for ...:
        if not any((condition1, condition2)):  # any() takes a single iterable
            break
        else:
            if condition1:
                ...
            if condition2:
                ...
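This way the loop exits on its first step when neither condition holds, so you avoid both the remaining iterations and the repeated checks of the two conditions.
(Sorry about the formatting, I'm typing from a phone.)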
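A quick way to compare the break version with the guard placed in front of the loop is to count how many loop bodies are entered. This is only an illustrative sketch: the function names are made up, and the real calculations are left out.

```python
def steps_entered_with_break(number_of_steps, condition1, condition2):
    """Count loop bodies entered when the check lives inside the loop."""
    entered = 0
    for step in range(number_of_steps):
        entered += 1
        if not any((condition1, condition2)):  # any() takes one iterable
            break
        # The real calculations would run here.
    return entered


def steps_entered_with_guard(number_of_steps, condition1, condition2):
    """Count loop bodies entered when the check is made before the loop."""
    entered = 0
    if condition1 or condition2:
        for step in range(number_of_steps):
            entered += 1
            # The real calculations would run here.
    return entered


print(steps_entered_with_break(1000, False, False))  # 1
print(steps_entered_with_guard(1000, False, False))  # 0
print(steps_entered_with_break(1000, True, False))   # 1000
```

The break version still enters the loop once and, when a condition does hold, repeats the `any` check on every step, whereas the guard in front of the loop is evaluated exactly once.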
This is what I would do:
    calculations = [
        f for c, f in [
            (condition1, do_calculation1),
            (condition2, do_calculation2),
        ] if c
    ]
    if calculations:
        for step in np.arange(0, number_of_steps):
            for calc in calculations:
                calc(step)
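The same selection can also be written with `itertools.compress` from the standard library, which keeps the items whose matching selector is truthy. This is an alternative spelling, not what the answer above uses. The `do_calculation` functions, the `calls` list, and the flag values below are hypothetical, chosen only to make the sketch runnable.

```python
from itertools import compress

calls = []


def do_calculation1(step):
    # Hypothetical calculation: just record that it ran.
    calls.append(("calc1", step))


def do_calculation2(step):
    # Hypothetical calculation: just record that it ran.
    calls.append(("calc2", step))


condition1, condition2 = False, True  # example flags for the whole dataset

# Keep only the functions whose flag is truthy.
calculations = list(
    compress([do_calculation1, do_calculation2], [condition1, condition2])
)

if calculations:  # skip the loop entirely when nothing is selected
    for step in range(2):
        for calc in calculations:
            calc(step)

print(calls)  # [('calc2', 0), ('calc2', 1)]
```

Either spelling evaluates the conditions once and leaves a loop body with no flag tests in it. Adding a third calculation only means adding one more pair.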
I think the first approach is the best, because for a given dataset each condition (1 and/or 2) either holds or it doesn't.
    for step in np.arange(0, number_of_steps):
        if condition1:
            do_calculation1(step)
        if condition2:
            do_calculation2(step)
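If the two conditions are mutually exclusive, you can replace the second if with elif. That saves evaluating condition2 whenever condition1 is true.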
    for step in np.arange(0, number_of_steps):
        if condition1:
            do_calculation1(step)
        elif condition2:
            do_calculation2(step)
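The elif form is only equivalent when the conditions really cannot both be true. A small sketch of the difference, with made-up function names that return which calculations would run:

```python
def run_independent(condition1, condition2):
    """Two separate ifs: both branches can run."""
    ran = []
    if condition1:
        ran.append("calc1")
    if condition2:
        ran.append("calc2")
    return ran


def run_exclusive(condition1, condition2):
    """if/elif: the second test is skipped once the first succeeds."""
    ran = []
    if condition1:
        ran.append("calc1")
    elif condition2:
        ran.append("calc2")
    return ran


print(run_independent(True, True))  # ['calc1', 'calc2']
print(run_exclusive(True, True))    # ['calc1']
print(run_exclusive(False, True))   # ['calc2']
```

If both conditions can hold for the same dataset, as in the original question, switching to elif silently drops the second calculation.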