Python izip and repeat memory behaviour

Question

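I have nested function calls that also use multiprocessing. Something (izip, repeat, or something else) seems to be copying objects rather than passing them by reference, with some packing and unpacking done along the way.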

Here is the structure of the call sequence:

from itertools import izip, repeat
from multiprocessing import Pool

def main():
    print 'Rel_list id main: %s' % str(id(rel_list))
    par_objective(folder.num_proc, batch, r, folder.d, vocab_len,
                  rel_list, lambdas)

def par_objective(num_proc, data, params, d, len_voc, rel_list, lambdas):
    pool = Pool(processes=num_proc)

    # non-data params
    oparams = [params, d, len_voc, rel_list]

    print 'Rel_list id paro: %s' % str(id(rel_list))
    # split_data is not defined in this excerpt (presumably data split into chunks)
    result = pool.map(objective_and_grad, izip(repeat(oparams), split_data))


def objective_and_grad(par_data):
    # unpack the shared parameters and this worker's chunk of data
    (params, d, len_voc, rel_list), data = par_data

    print 'Rel_list id obag: %s' % str(id(rel_list))

Output:

ID IN MAIN
Rel_list id main: 140694049352088
ID IN PAR_OBJECTIVE
Rel_list id paro: 140694049352088
IDs IN OBJECTIVE_AND_GRAD (24 Processes):
Rel_list id obag: 140694005483424
Rel_list id obag: 140694005481840
Rel_list id obag: 140694311306232
Rel_list id obag: 140694048889168
Rel_list id obag: 140694057601144
Rel_list id obag: 140694054472232
Rel_list id obag: 140694273611104
Rel_list id obag: 140693878744632
Rel_list id obag: 140693897912976
Rel_list id obag: 140693753182328
Rel_list id obag: 140694282174976
Rel_list id obag: 140693900442800
Rel_list id obag: 140694271314328
Rel_list id obag: 140694276073736
Rel_list id obag: 140694020435696
Rel_list id obag: 140693901952208
Rel_list id obag: 140694694615376
Rel_list id obag: 140694271773512
Rel_list id obag: 140693899163264
Rel_list id obag: 140694047135792
Rel_list id obag: 140694276808432
Rel_list id obag: 140694019346088
Rel_list id obag: 140693897455016
Rel_list id obag: 140694067166024
Rel_list id obag: 140694278467024
Rel_list id obag: 140694010924280
Rel_list id obag: 140694026060576

BACK TO MAIN, RINSE AND REPEAT
Rel_list id main: 140694049352088
Rel_list id paro: 140694049352088

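As you can see, the id of the list is the same in main() and par_objective(), but it changes once the list is passed to the multiprocessing pool.

multiprocessing forks with copy-on-write, and the list is never modified, yet the id changes. Does that mean the memory is copied, or is it only the id that changes?

Is there any way to check whether the memory has been copied? And if these are copies, why are they being copied?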
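One way to narrow this down is to test izip and repeat on their own, without a pool. The sketch below is written for Python 3, where izip is the built-in zip; `shared` and `chunks` are hypothetical stand-ins for oparams and split_data. It suggests that neither function copies anything: every tuple holds a reference to the very same object.

```python
from itertools import repeat

shared = ["params", "d", "len_voc", "rel_list"]   # stand-in for oparams
chunks = [[1, 2], [3, 4], [5, 6]]                 # stand-in for split_data

# zip stops at the shorter input, so the endless repeat() is cut to len(chunks)
pairs = list(zip(repeat(shared), chunks))

# identity check: is the first element of every pair the original object?
same_object = all(first is shared for first, _ in pairs)
print("repeat/zip hand out the same object:", same_object)
```

If this holds in the real program too, the change of id has to come from a later step, that is, from handing the arguments to the pool.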

Answer 1

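Your Python objects are being modified: you are creating additional references to them, so the reference count stored inside each object changes and a copy is made by the OS.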
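The reference-count effect can be observed directly in CPython. This is a minimal sketch, assuming CPython and using a hypothetical small list in place of rel_list. Binding a second name to the list copies nothing, but it does change the counter stored in the object's header, which is a write to the memory page that holds the object.

```python
import sys

rel_list = [1, 2, 3]                 # stand-in for the question's rel_list

before = sys.getrefcount(rel_list)   # includes the temporary argument reference
alias = rel_list                     # a second name for the same object, no copy
after = sys.getrefcount(rel_list)

print("same object:", alias is rel_list)
print("refcount grew:", after > before)
```

In a forked child, a write like this is what would make the OS copy the affected page, even though the list contents are never modified.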

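Any Python object that a child process needs to access must have a reference count that is independent of the main process. So Python multiprocessing can never simply use the same memory region; a copy is always needed.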
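The changing ids in the question have a second cause worth separating from copy-on-write. As far as I know, Pool.map pickles each task argument and sends it to the worker through a pipe, so the worker rebuilds it as a new object. The sketch below is for Python 3 on a POSIX system (it calls os.fork directly), and `ids_seen_by_child` is a hypothetical helper, not part of multiprocessing. It compares, inside a forked child, the id of an inherited object with the id of an unpickled copy.

```python
import os
import pickle

rel_list = list(range(1000))  # stand-in for the question's rel_list


def ids_seen_by_child(obj):
    """Fork once; return (id of inherited obj, id of unpickled copy) as seen in the child."""
    payload = pickle.dumps(obj)          # bytes, the form a task argument travels in
    read_fd, write_fd = os.pipe()
    pid = os.fork()                      # POSIX only
    if pid == 0:                         # child process
        try:
            os.close(read_fd)
            received = pickle.loads(payload)   # rebuilt as a new object
            report = "%d %d" % (id(obj), id(received))
            os.write(write_fd, report.encode("ascii"))
            os.close(write_fd)
        finally:
            os._exit(0)                  # exit the child without parent cleanup
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:   # reads until the child closes its end
        inherited_id, received_id = (int(x) for x in reader.read().split())
    os.waitpid(pid, 0)
    return inherited_id, received_id


inherited_id, received_id = ids_seen_by_child(rel_list)
print("inherited object keeps its id:", inherited_id == id(rel_list))
print("unpickled copy has a new id:  ", received_id != inherited_id)
```

An unchanged id only says that the virtual address is the same; it does not show whether the underlying physical page is still shared. To check actual memory use on Linux, one option is to compare the shared and private figures the kernel reports per process, for example in /proc/<pid>/smaps.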
