The gradient at the minimum returned by fmin_l_bfgs_b is not zero
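I am using fmin_l_bfgs_b to approximate the minimum of a function. The problem is unconstrained. I am using "approx_grad" so that the minimum is found with a numerically approximated gradient.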
weights_sp_new, func_val, info_dict = fmin_l_bfgs_b(
    func_to_minimize, self.w_vectors[si][pj],
    args=(self.sigma_vector[si][pj], Y, X, E_step_results[si][pj]),
    approx_grad=True, factr=10000000.0, pgtol=1e-05, epsilon=1e-04)
I tried different initial guesses on the same objective function. The information dictionaries returned are as follows:
information dictionary: {'nit': 180, 'funcalls': 4480, 'warnflag': 0,
'task': b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH',
'grad': array([ 1.69003327e+00, 2.29250366e+00, 1.55528930e+00,
9.84251656e-01, -1.10133624e-02, 1.83795773e+00,
6.44715933e-01, 2.01643592e+00, 8.71323232e-01,
9.93009353e-01, 1.34615338e+00, 4.20859578e-04,
-2.22691328e-01, -2.13318804e-01, -4.38475622e-01,
4.79004570e-01, -4.11879746e-01, 1.71003313e+00])}
information dictionary: {'nit': 0, 'funcalls': 20, 'warnflag': 0,
'task': b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL',
'grad': array([ 1.84672949e-20, 1.49550746e-20, 1.11115003e-20,
2.73908962e-20, 0.00000000e+00, 2.62916240e-20,
0.00000000e+00, 4.95859400e-20, 4.70618521e-20,
4.77249742e-20, 2.80864703e-20, 0.00000000e+00,
1.84975333e-21, 7.63125358e-21, 1.35733459e-20,
6.34943656e-21, 1.02743864e-20, 5.31287405e-20])}
information dictionary: {'nit': 107, 'funcalls': 2460, 'warnflag': 0,
'task': b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH',
'grad': array([ -3.09184019, -0.70217764, 0.72096009, -3.23745189,
-1.18111435, -4.13185742, 3.90762754, 2.28011806,
-3.02289147, -1.21219666, 1.80007832, -12.44630606,
-1.59126124, 1.59139978, -1.96677574, -0.50837465,
1.20439043, -1.58858602])}
information dictionary: {'nit': 132, 'funcalls': 2980, 'warnflag': 0,
'task': b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH',
'grad': array([ -8.56568098, -9.39712794, -8.82591339, -8.61912864,
-0.53956945, -9.46679887, 0.89827947, -10.64991782,
-6.53652169, -7.34566878, -8.98861319, 1.28335021,
-2.39830071, -1.2056133 , -0.81190425, -1.3537686 ,
-1.65028498, -8.30791505])}
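As you can see, convergence was reported as successful, but the gradient at the minimum is not zero. I understand this means I have not reached the exact minimum and that the function could decrease further. What should I do now? Or can I simply accept this "approximate" minimum?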
There are two cases in the sample you provided:
The second run of your algorithm converged nicely, b'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
As you can see,
'grad': array([ 1.84672949e-20, 1.49550746e-20, 1.11115003e-20,
2.73908962e-20, 0.00000000e+00, 2.62916240e-20,
0.00000000e+00, 4.95859400e-20, 4.70618521e-20,
4.77249742e-20, 2.80864703e-20, 0.00000000e+00,
1.84975333e-21, 7.63125358e-21, 1.35733459e-20,
6.34943656e-21, 1.02743864e-20, 5.31287405e-20])
is essentially zero (to about 20 decimal places).
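To make the comparison concrete, here is a small sketch that checks the reported gradients against the pgtol criterion. It assumes the problem is unconstrained (as stated in the question), so the projected gradient equals the gradient, and it uses the largest absolute component as the norm. The arrays are copied from the first and second information dictionaries above; the names grad_run1 and grad_run2 are only illustrative.

```python
import numpy as np

pgtol = 1e-5  # value passed to fmin_l_bfgs_b in the question

# 'grad' from the first information dictionary (stopped on the FACTR test)
grad_run1 = np.array([
    1.69003327e+00, 2.29250366e+00, 1.55528930e+00,
    9.84251656e-01, -1.10133624e-02, 1.83795773e+00,
    6.44715933e-01, 2.01643592e+00, 8.71323232e-01,
    9.93009353e-01, 1.34615338e+00, 4.20859578e-04,
    -2.22691328e-01, -2.13318804e-01, -4.38475622e-01,
    4.79004570e-01, -4.11879746e-01, 1.71003313e+00])

# 'grad' from the second information dictionary (stopped on the PGTOL test)
grad_run2 = np.array([
    1.84672949e-20, 1.49550746e-20, 1.11115003e-20,
    2.73908962e-20, 0.00000000e+00, 2.62916240e-20,
    0.00000000e+00, 4.95859400e-20, 4.70618521e-20,
    4.77249742e-20, 2.80864703e-20, 0.00000000e+00,
    1.84975333e-21, 7.63125358e-21, 1.35733459e-20,
    6.34943656e-21, 1.02743864e-20, 5.31287405e-20])

for name, g in [("run 1", grad_run1), ("run 2", grad_run2)]:
    g_max = np.max(np.abs(g))  # largest absolute gradient component
    print(name, "max |g_i| =", g_max, "| below pgtol:", g_max <= pgtol)
```

Only the second run satisfies the gradient test; the first run is several orders of magnitude above pgtol, so it must have stopped for a different reason.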
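The remaining runs terminated because the function value was no longer changing significantly, b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH', so you can do one (or more) of the following:
- Decrease the factr parameter of fmin_l_bfgs_b. From the documentation: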
factr : float
The iteration stops when (f^k -
f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= factr * eps, where eps is the
machine precision, which is automatically generated by the code.
Typical values for factr are: 1e12 for low accuracy; 1e7 for moderate
accuracy; 10.0 for extremely high accuracy.
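- Think about your function: can it be simplified? Are plateaus (very flat regions) a problem? If so, perhaps you can change its definition to reduce their impact.
- Compute the analytical gradient (and thereby improve precision).
- Change epsilon, as your numerical approximation of the gradient may not be accurate enough.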
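As a rough illustration of these suggestions, the sketch below uses SciPy's built-in Rosenbrock function (rosen, with analytic gradient rosen_der) as a stand-in for the real objective, which is not shown in the question. The starting point x0 and the tighter tolerances are assumptions chosen only for illustration. It prints the stopping threshold factr * eps, then compares a finite-difference run using the question's settings with a run that supplies the analytic gradient and a smaller factr.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b, rosen, rosen_der, check_grad

x0 = np.full(5, 1.5)  # hypothetical starting point

# Threshold used by the REL_REDUCTION_OF_F test for factr=1e7
eps_mach = np.finfo(float).eps
print("factr * eps for factr=1e7:", 1e7 * eps_mach)

# Settings from the question: finite-difference gradient with a
# fairly large step (SciPy's default epsilon is 1e-8).
x_fd, f_fd, info_fd = fmin_l_bfgs_b(
    rosen, x0, approx_grad=True, factr=1e7, pgtol=1e-5, epsilon=1e-4)

# Before trusting a hand-written gradient, compare it with finite
# differences; a small value means the two agree at x0.
grad_err = check_grad(rosen, rosen_der, x0)
print("check_grad error at x0:", grad_err)

# Analytic gradient plus a much smaller factr (tighter stopping test).
x_an, f_an, info_an = fmin_l_bfgs_b(
    rosen, x0, fprime=rosen_der, factr=10.0, pgtol=1e-8)

for label, info in [("finite differences", info_fd), ("analytic", info_an)]:
    print(label, "|", info["task"], "| max |g_i| =", np.max(np.abs(info["grad"])))
```

If the largest gradient component in the second run is much smaller, the early stop was probably caused by the tolerance or by finite-difference error rather than by the objective itself. Running check_grad first is a cheap way to validate an analytic gradient before relying on it.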