实现最速下降算法，可变步长

Question

我正在尝试用编程语言 (C/C++/fortran) 实现最速下降算法。

例如 f(x1,x2) = x1^3 + x2^3 - 2*x1*x2

的最小化

估计起始设计点 x0，迭代计数器 k0，收敛参数公差 = 0.1。假设这个起点是 (1,0)
计算f(x1,x2)在当前点x(k)的梯度为grad(f)。我会在这里使用数值微分。

d/dx1 (f) = lim (h->0) (f(x1+h,x2) - f(x1,x2) )/h

这是梯度(f)=(3*x1^2 - 2*x2, 3*x2^2 - 2*x1)

grad(f) 在 (0,1) 是 c0 = (3,-2)
因为c0的L2范数>公差，我们进行下一步
方向d0 = -c0 = (-3,2)
计算步长a。最小化 f(a) = f(x0 + ad0) = (1-3a,2a) = (1-3a)^3 + (2a)^3 - 2(1- 3a)*(2a)。我没有保持恒定的步长。
更新：新[x1,x2] = 旧[x1,x2]x + a*d0。

我不明白第 5 步该怎么做。我有一个使用二分法的一维最小化程序，它看起来像：

program main()
    ...
    ...
    define upper, lower interval
    call function value
    ...calculations
    ...
    ...


function value (input x1in) (output xout)
    ...function is x^4 - 2x^2 + x + 10 
    xout = (xin)^4 - 2*(xin)^2 + (xin) + 10

在这种情况下，查看第 5 步，我无法传递符号 a。关于如何用编程语言实现算法的任何想法，尤其是第 5 步？请建议是否有完全不同的编程方式。我见过很多步长不变的程序，但我想在每一步都计算它。这个算法可以很容易地在 MATLAB ot python sympy 中使用符号来实现，但我不想使用符号。任何建议表示赞赏。谢谢。

Answer 1

如果 C++ 是一个选项，您可以利用 functors and lambdas。

让我们考虑一个我们想要最小化的函数，例如y = x² - x + 2。它可以表示为一个函数对象，它是一个class 重载 operator():

struct MyFunc {
    double operator()( double x ) const {
        return  x * x - x + 2.0;
    }
};

现在我们可以声明一个这种类型的对象，像函数一样使用它，并将它作为模板参数传递给其他模板函数。

// given this templated function:
template < typename F >
void tabulate_function( F func, double a, double b, int steps ) {
    //  the functor     ^^^^^^  is passed to the templated function
    double step = (b - a) / (steps - 1);

    std::cout << "    x          f(x)\n------------------------\n";
    for ( int i = 0; i < steps; ++i ) {
        double x = a + i * step,
               fx = func(x);
        //          ^^^^^^^ call the operator() of the functor
        std::cout << std::fixed << std::setw(8) << std::setprecision(3) << x
                  << std::scientific << std::setw(16) << std::setprecision(5)
                  << fx << '\n';
    }   
}

// we can use the previous functor like this:
MyFunc example;
tabulate_function(example, 0.0, 2.0, 21);

OP 的功能可以用类似的方式实现（给定一个助手 class 来表示 2D 点）：

struct MyFuncVec {
    double operator()( const Point &p ) const {
        return p.x * p.x * p.x  +  p.y * p.y * p.y  -  2.0 * p.x * p.y;
    }
};

该函数的梯度可以表示为（给定一个 class，它实现了一个 2D 向量）：

struct MyFuncGradient {
    Vector operator()( const Point &p ) {
        return Vector(3.0 * p.x * p.x  -  2.0 * p.y, 3.0 * p.y * p.y  -  2.0 * p.x);
    }
};

现在，OP题的第五步要求使用一维优化算法沿梯度方向最小化第一个函数，需要传递一个一维函数。我们可以使用 lambda 来解决这个问题：

    MyFuncVec funcOP;
    MyFuncGradient grad_funcOP; 
    Point p0(0.2, 0.8);
    Vector g = grad_funcOP(p0);

    // use a lambda to transform the OP function to 1D
    auto sliced_func = [&funcOP, &p0, &g] ( double t ) -> double {
        // those variables ^^^ ^^^ ^^ are captured and used
        return funcOP(p0 - t * g);
    };

    tabulate_function(sliced_func, 0, 0.5, 21);

实例HERE.

实现最速下降算法，可变步长

Implementing steepest descent algorithm, variable step size

c

c++

algorithm

numerical-methods

gradient-descent