Running the same code but with two different datasets (inputs)

Question

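I have some code in JupyterLab that consists of several functions spread across multiple cells. The first function generates a dataset that is used by all of the functions after it.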

What I want to do is run the same code twice, but with one of the functions modified. So it would look something like this:

data_generating_function() # this function should only be run once so it generates the same dataset for both trials
function_1() # this is the function that is to be modified once, so there are two versions of this function
function_2() # this function and all functions below it stay the same but should be run twice
function_3()
function_4()
function_5()

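So I would run data_generating_function() once and generate the dataset. Then I would run one version of function_1() and all the functions below it, and then I would run the other version of function_1() and all the functions below it again.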

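What would be a good way to implement this? I could obviously duplicate the code and change some of the function names, or I could put everything into a single cell and wrap it in a for loop. However, is there a better way that would ideally also preserve the multiple cells?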

Thanks

Answer 1

Simply iterate over the two choices for the first function:

data_generating_function() 
for func1 in (function1a, function1b):
    func1()
    function_2()
    function_3()
    function_4()
    function_5()
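
A self-contained sketch of that loop follows. The function bodies are hypothetical placeholders rather than the asker's code, and the sketch assumes the functions take arguments and return values instead of relying on globals. It shows the dataset being created once and reused for both variants, with each variant's result stored under the function's name:

```python
import random

def data_generating_function(seed=0, n=5):
    # Hypothetical stand-in: build the dataset once.
    rng = random.Random(seed)
    return [rng.randint(0, 100) for _ in range(n)]

def function1a(data):
    # Hypothetical variant A of the first step.
    return [x * 2 for x in data]

def function1b(data):
    # Hypothetical variant B of the first step.
    return [x + 1 for x in data]

def function_2(values):
    # Hypothetical shared downstream step.
    return sum(values)

dataset = data_generating_function()  # runs exactly once

results = {}
for func1 in (function1a, function1b):
    intermediate = func1(dataset)  # neither variant mutates the dataset
    results[func1.__name__] = function_2(intermediate)

print(results)
```

Keying the results by `func1.__name__` keeps the two trials apart so they can be compared in a later cell.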

Answer 2

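You should avoid modifying functions or iterating over them directly whenever possible. In this case, the best thing to do is add a boolean parameter to function1 that specifies which version of the function to run. It would look something like this: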

def function1(isFirstTime):
  if isFirstTime:
    # do stuff the first time
    pass
  else:
    # do stuff the second time
    pass

Then you can loop over the two flag values:

data_generating_function()
for b in (True, False):
  function1(b)
  function2()
  function3()
  # ...

Answer 3

Sorry if I have misunderstood the question, but could you not do the following:

Cell 1:

# define all functions

Cell 2:

dataset = data_generating_function()

Cell 3:

# Run version 1 of function 1 on dataset
result_1_1 = function_1_v1(dataset)
result_2_1 = function_2(result_1_1)
result_3_1 = function_3(result_2_1)
function_4(result_3_1)

Cell 4:

# Run version 2 of function 1 on dataset
result_1_2 = function_1_v2(dataset)
result_2_2 = function_2(result_1_2)
result_3_2 = function_3(result_2_2)
function_4(result_3_2)

This solution assumes that:

you define your functions with return values
passing the results around is not "expensive"

If that is not the case, you could also save the results to files.
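
One way to save a result to a file is the standard-library `pickle` module. The sketch below is only an illustration: the helper names `save_result` and `load_result`, the file name, and the sample data are all hypothetical, and a temporary directory is used so the example does not touch any real project files.

```python
import pickle
import tempfile
from pathlib import Path

def save_result(obj, path):
    # Serialize any picklable Python object to disk.
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)

def load_result(path):
    # Read the object back in a later cell or session.
    with open(path, "rb") as fh:
        return pickle.load(fh)

# Hypothetical location and data, for illustration only.
cache_dir = Path(tempfile.mkdtemp())
result_path = cache_dir / "result_1_1.pkl"

save_result({"rows": [1, 2, 3]}, result_path)
restored = load_result(result_path)
print(restored)
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.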

To reduce code duplication in function_1, you could add a parameter that switches between the two versions.
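
A minimal sketch of such a switch, assuming the two versions share some common work and differ only in one step. The parameter name `version` and the function body are hypothetical:

```python
def function_1(dataset, version=1):
    # Shared work that both versions need (hypothetical).
    cleaned = [x for x in dataset if x is not None]

    # Only the differing step depends on the version.
    if version == 1:
        return [x * 2 for x in cleaned]
    if version == 2:
        return [x + 1 for x in cleaned]
    raise ValueError(f"unknown version: {version!r}")

dataset = [1, None, 3]
result_1_1 = function_1(dataset, version=1)
result_1_2 = function_1(dataset, version=2)
print(result_1_1, result_1_2)
```

Raising on an unknown version makes a typo in the argument fail immediately instead of silently running the wrong branch.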

Tags: python, function, jupyter-notebook, jupyter-lab
