Scipy minimization says it is successful, then continues with warnings
I am trying to minimize a function, and I am displaying the progress scipy makes as it runs. The first message displayed is...
Optimization terminated successfully.
Current function value: 0.000113
Iterations: 32
Function evaluations: 13299
Gradient evaluations: 33
This looks promising. The problem is that the process does not terminate. In fact, it keeps going, displaying messages like
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.023312
Iterations: 50
Function evaluations: 20553
Gradient evaluations: 51
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.068360
Iterations: 50
Function evaluations: 20553
Gradient evaluations: 51
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.071812
Iterations: 50
Function evaluations: 20553
Gradient evaluations: 51
Warning: Maximum number of iterations has been exceeded.
Current function value: 0.050061
Iterations: 50
Function evaluations: 20553
Gradient evaluations: 51
Here is the code that calls the minimization:
def one_vs_all(X, y, num_labels, lmbda):
    # store dimensions of X that will be reused
    m = X.shape[0]
    n = X.shape[1]
    # append ones vector to X matrix
    X = np.column_stack((np.ones((X.shape[0], 1)), X))
    # create array in which thetas will be returned
    all_theta = np.zeros((num_labels, n+1))
    # choose initial thetas
    #init_theta = np.zeros((n+1, 1))
    for i in np.arange(num_labels):
        # note theta should be first arg in objective func signature, followed by X and y
        init_theta = np.zeros((n+1, 1))
        theta = minimize(lrCostFunctionReg, x0=init_theta, args=(X, (y == i)*1, lmbda),
                         options={'disp': True, 'maxiter': 50})
        all_theta[i] = theta.x
    return all_theta
I have tried changing the minimization method and varying the iteration limit from as low as 30 to as high as 1000. I have also tried supplying my own gradient function. In every case the routine does eventually produce an answer, but it is completely wrong. Does anyone know what is going on?
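For reference, this is roughly the shape of the call when I supplied the gradient (a sketch; 'CG' here is just an example method, and lrGradFunctionReg is defined in the edit below):

theta = minimize(lrCostFunctionReg, x0=init_theta,
                 args=(X, (y == i)*1, lmbda),
                 jac=lrGradFunctionReg,   # gradient supplied via the jac keyword
                 method='CG',             # example; I tried several methods
                 options={'disp': True, 'maxiter': 50})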
EDIT:
The function is differentiable. Here is the cost function, followed by its gradient (unregularized, then regularized).
def lrCostFunctionReg(theta, X, y, lmbda):
    m = X.shape[0]
    # unregularized cost
    h = sigmoid(X @ theta)
    # calculate regularization term
    reg_term = ((lmbda / (2*m)) * (theta[1:,].T @ theta[1:,]))
    cost_reg = (1/m) * (-(y.T @ np.log(h)) - ((1 - y).T @ np.log(1 - h))) + reg_term
    return cost_reg
def gradFunction(theta, X, y):
    m = X.shape[0]
    theta = np.reshape(theta, (theta.size, 1))
    # hypothesis as generated in cost function
    h = sigmoid(X @ theta)
    # unregularized gradient
    grad = (1/m) * np.dot(X.T, (h - y))
    return grad
def lrGradFunctionReg(theta, X, y, lmbda):
    m = X.shape[0]
    # theta reshaped to ensure proper operation
    theta = np.reshape(theta, (theta.size, 1))
    # generate unregularized gradient
    grad = gradFunction(theta, X, y)
    # calc regularized gradient w/o touching intercept; essential that only 1 index used
    grad[1:,] = ((lmbda / m) * theta[1:,]) + grad[1:,]
    return grad.flatten()
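A quick way to sanity-check a hand-written gradient like this is scipy.optimize.check_grad, which compares it against a finite-difference estimate. A minimal sketch with a toy cost (not the functions above):

import numpy as np
from scipy.optimize import check_grad

# Toy cost and its analytic gradient, purely for illustration.
def toy_cost(theta):
    return np.sum(theta ** 2)

def toy_grad(theta):
    return 2 * theta

theta0 = np.random.rand(5)                     # 1-D, as SciPy expects
print(check_grad(toy_cost, toy_grad, theta0))  # a value near 0 means the two agree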
To answer my own question, the problem turned out to be a vector-shape issue. I like to code in 2-D, but the SciPy optimization routines only work with column and row vectors that have been "flattened" into 1-D arrays. Multi-dimensional matrices are fine; explicit column and row vectors are one dimension too many.
For example, if y is a vector of labels and y.shape is (400,1), you need to call y.flatten(), which makes y.shape = (400,). SciPy will then handle your data, assuming all your other dimensions make sense.
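A minimal demonstration of the difference:

import numpy as np

y = np.zeros((400, 1))   # 2-D column vector: one dimension too many for minimize
print(y.shape)           # (400, 1)
y = y.flatten()          # 1-D array: what SciPy's routines expect
print(y.shape)           # (400,)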
So if your effort to convert MATLAB machine-learning code to Python has stalled, check that you have flattened your row and column vectors, especially those returned by a gradient function.
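Putting it together, here is a sketch of the fixed loop body from one_vs_all above (assuming the surrounding code is unchanged; the cost function already works with 1-D inputs, while the gradient function would need y flattened consistently before being passed as jac):

# Sketch of the fix: hand minimize 1-D arrays instead of column vectors.
init_theta = np.zeros(n + 1)              # was np.zeros((n+1, 1))
yi = ((y == i) * 1).flatten()             # flatten the label vector as well
theta = minimize(lrCostFunctionReg, x0=init_theta, args=(X, yi, lmbda),
                 options={'disp': True, 'maxiter': 50})
all_theta[i] = theta.x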