Octave fminunc doesn't converge

Question

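I am trying to use fminunc in Octave to solve a logistic regression problem, but it doesn't work. It complains that I haven't defined the variables, even though I have. If I define the variables directly inside the cost function instead of in main, the error goes away, but the optimization doesn't actually do anything: exitFlag is -3 and it doesn't converge at all.

This is my function: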

function [jVal, gradient] = cost(theta, X, y)

X = [1,0.14,0.09,0.58,0.39,0,0.55,0.23,0.64;1,-0.57,-0.54,-0.16,0.21,0,-0.11,-0.61,-0.35;1,0.42,0.45,-0.41,-0.6,0,-0.44,0.38,-0.29];
y = [1;0;1];
theta = [0.8;0.2;0.6;0.3;0.4;0.5;0.6;0.2;0.4];


jVal = 0;
jVal = costFunction2(X, y, theta);   %this is another function that gives me jVal. I'm quite sure it is 
                                     %correct because I use it also with other algorithms and it  
                                     %works perfectly
m = length(y);
xSize = size(X, 2);
gradient = zeros(xSize, 1);
sig = X * theta;
h = 1 ./(1 + exp(-sig));
  
  for i = 1:m
  
    for j = 1:xSize    
    gradient(j) =  (1/m) * sum(h(i) - y(i)) .* X(i, j);      
    end
    
   end

This is my main script:

theta = [0.8;0.2;0.6;0.3;0.4;0.5;0.6;0.2;0.4];
options = optimset('GradObj', 'on', 'MaxIter', 100);
[optTheta, functionVal, exitFlag] = fminunc(@cost, theta, options)

When I run it, I get:

optTheta =

   0.80000
   0.20000
   0.60000
   0.30000
   0.40000
   0.50000
   0.60000
   0.20000
   0.40000

functionVal =  0.15967
exitFlag = -3

How can I solve this problem?

Answer 1

You are not actually using fminunc correctly. From the documentation:

 -- fminunc (FCN, X0)
 -- fminunc (FCN, X0, OPTIONS)

     FCN should accept a vector (array) defining the unknown variables,
     and return the objective function value, optionally with gradient.
     'fminunc' attempts to determine a vector X such that 'FCN (X)' is a
     local minimum.

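What you are passing is not a handle to a function that accepts a single vector argument. Instead, what you are passing (i.e. @cost) is a handle to a function that takes three arguments.

You need to 'convert' this into a handle to a function that takes only one input and does what you want under the hood. The simplest way is to 'wrap' your cost function in an anonymous function that takes a single argument and calls cost in the appropriate way, e.g.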

fminunc( @(t) cost(t, X, y), theta, options )

Note: this assumes X and y are defined in the scope where you do this 'wrapping' business.
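
To make the wrapping concrete, here is a minimal sketch of the whole flow. Several things in it are assumptions rather than part of the original post. logisticCost is a hypothetical stand-in for cost that uses the arguments it receives: the hard-coded X, y and theta lines inside the question's function overwrite whatever fminunc passes in, so the returned value never changes no matter which point is being tried. It computes the standard unregularized logistic cost, because costFunction2 is not shown. It also uses the vectorized gradient (1/m) * X' * (h - y), which sums over all examples, whereas the loop in the question overwrites gradient(j) on every pass over i and so keeps only the last example's contribution.

```octave
1;  % not a function file: lets the function below be defined inside a script

% Hypothetical replacement for cost(): standard logistic-regression cost and
% gradient. It relies only on its arguments; nothing is hard-coded inside.
function [jVal, gradient] = logisticCost(theta, X, y)
  m = length(y);
  h = 1 ./ (1 + exp(-X * theta));                 % sigmoid hypothesis, m x 1
  jVal = -(1/m) * sum(y .* log(h) + (1 - y) .* log(1 - h));
  gradient = (1/m) * X' * (h - y);                % n x 1, summed over examples
end

% Data from the question, defined in the calling scope instead.
X = [1, 0.14, 0.09, 0.58, 0.39, 0, 0.55, 0.23, 0.64;
     1,-0.57,-0.54,-0.16, 0.21, 0,-0.11,-0.61,-0.35;
     1, 0.42, 0.45,-0.41,-0.60, 0,-0.44, 0.38,-0.29];
y = [1; 0; 1];
theta0 = [0.8; 0.2; 0.6; 0.3; 0.4; 0.5; 0.6; 0.2; 0.4];

options = optimset('GradObj', 'on', 'MaxIter', 100);

% The anonymous function captures X and y, so fminunc sees one argument only.
wrapped = @(t) logisticCost(t, X, y);
[optTheta, functionVal, exitFlag] = fminunc(wrapped, theta0, options);
```

With only three examples and nine features the data are most likely linearly separable, so the unregularized cost may have no finite minimizer. Treat the resulting optTheta and exitFlag as illustrative; the point of the sketch is that the cost now changes with t and that fminunc gets a one-argument handle.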
