Error using multiprocessing library: "got multiple values for keyword argument 'x'"
I am trying to parallelize a penalized linear model using the multiprocessing library in Python.
I created a function that solves the model:
from __future__ import division
import numpy as np
from cvxpy import *
def lm_lasso_solver(x, y, lambda1):
    n = x.shape[0]
    m = x.shape[1]
    lambda1_param = Parameter(sign="positive")
    betas_var = Variable(m)
    response = dict(model='lm', penalization='l')
    response["parameters"] = {"lambda_vector": lambda1}
    lasso_penalization = lambda1_param * norm(betas_var, 1)
    lm_penalization = 0.5 * sum_squares(y - x * betas_var)
    objective = Minimize(lm_penalization + lasso_penalization)
    problem = Problem(objective)
    lambda1_param.value = lambda1
    try:
        problem.solve(solver=ECOS)
    except:
        try:
            problem.solve(solver=CVXOPT)
        except:
            problem.solve(solver=SCS)
    beta_sol = np.asarray(betas_var.value).flatten()
    response["solution"] = beta_sol
    return response
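In this function, x is the matrix of predictors and y is the response variable. lambda1 is the parameter that must be optimized, so it is the one I want to parallelize over. I saved this script in a Python file called "ms.py".
Then I created another Python file called "parallelization.py", in which I defined the following: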
import multiprocessing as mp
import ms
import functools
def myFunction(x, y, lambda1):
    pool = mp.Pool(processes=mp.cpu_count())
    results = pool.map(functools.partial(ms.lm_lasso_solver, x=x, y=y), lambda1)
    return results
So now the idea is to run the following in the Python interpreter:
from sklearn.datasets import load_boston
boston = load_boston()
x = boston.data
y = boston.target
runfile('parallelization.py')
lambda_vector = np.array([1,2,3])
myFunction(x, y, lambda_vector)
But when I do this, I get the error message quoted in the title: "got multiple values for keyword argument 'x'".
The problem is in this line:
results = pool.map(functools.partial(ms.lm_lasso_solver, x=x, y=y), lambda1)
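You are binding x and y as keyword arguments in functools.partial(). pool.map() then passes each element of lambda1 as the first positional argument of lm_lasso_solver, which is also x, so x receives two values. Bind x and y as positional arguments instead, so that each element of lambda1 lands in the third slot: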
results = pool.map(functools.partial(ms.lm_lasso_solver, x, y), lambda1)
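To see why the keyword form fails, it may help to reproduce it without a pool at all. The sketch below assumes a hypothetical stand-in function, solver, with the same signature as lm_lasso_solver. It calls the partial object with one positional argument, which is what pool.map() does for each item.

```python
import functools


def solver(x, y, lambda1):
    # Hypothetical stand-in for ms.lm_lasso_solver: it only echoes its arguments.
    return {"x": x, "y": y, "lambda1": lambda1}


# Keyword binding: the mapped item arrives positionally and fills the first
# slot (x), while x is also supplied by keyword, so the call raises TypeError.
bound_kw = functools.partial(solver, x="X", y="Y")
try:
    bound_kw(1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Positional binding: x and y occupy the first two slots, so the mapped item
# becomes lambda1.
bound_pos = functools.partial(solver, "X", "Y")
results = [bound_pos(lam) for lam in [1, 2, 3]]
```

The exact wording of the TypeError differs between Python versions, but it always names x as the argument that received more than one value.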
Or simply use the apply_async() method of the pool object:
results = pool.apply_async(ms.lm_lasso_solver, args=[x, y, lambda1])
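Note that a single apply_async() call submits one task and returns an AsyncResult rather than the value itself, so passing the whole lambda1 vector would not split the work. A minimal sketch of submitting one task per lambda value follows. It assumes a hypothetical stand-in solver, and it swaps in multiprocessing.pool.ThreadPool so the example runs without pickling concerns; mp.Pool exposes the same apply_async() interface.

```python
from multiprocessing.pool import ThreadPool


def solver(x, y, lambda1):
    # Hypothetical stand-in for ms.lm_lasso_solver with the same signature.
    return {"lambda1": lambda1, "solution": [lambda1 * xi + y for xi in x]}


def solve_all(x, y, lambdas, workers=2):
    # ThreadPool is an assumption made for portability of this sketch.
    pool = ThreadPool(processes=workers)
    try:
        # One task per lambda value; each call returns an AsyncResult.
        pending = [pool.apply_async(solver, args=(x, y, lam)) for lam in lambdas]
        # get() blocks until that task finishes; submission order is preserved.
        return [job.get() for job in pending]
    finally:
        pool.close()
        pool.join()


results = solve_all([1, 2], 10, [1, 2, 3])
```

For a plain list of lambda values, pool.map() with the positional partial is shorter. apply_async() is mostly useful when tasks take different arguments or when results should be collected later.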