Symbol to gradient with @generated macro in Julia

For performance reasons, I need gradient and Hessian functions that execute as fast as user-defined functions (the ForwardDiff library, for example, makes my code noticeably slower). So I tried metaprogramming with the @generated macro, testing it on a simple function:
using Calculus
hand_defined_derivative(x) = 2x - sin(x)

symbolic_primal = :( x^2 + cos(x) )
symbolic_derivative = differentiate(symbolic_primal,:x)
@generated functional_derivative(x) = symbolic_derivative

This is exactly what I wanted:

rand_x = rand(10000);
exact_values = hand_defined_derivative.(rand_x)
test_values = functional_derivative.(rand_x)

isequal(exact_values,test_values)        # >> true

using BenchmarkTools
@btime hand_defined_derivative.(rand_x); # >> 73.358 μs (5 allocations: 78.27 KiB)
@btime functional_derivative.(rand_x);   # >> 73.456 μs (5 allocations: 78.27 KiB)
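
For reference, a sketch of the ForwardDiff comparison mentioned above (the name fd_derivative is made up, and exact timings will vary):

using ForwardDiff
fd_derivative(x) = ForwardDiff.derivative(z -> z^2 + cos(z), x)
@btime fd_derivative.(rand_x);   # slower here, per the motivation above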

I now need to generalize this to functions with more arguments. The obvious extrapolation would be:

symbolic_primal = :( x^2 + cos(x) + y^2  )
symbolic_gradient = differentiate(symbolic_primal,[:x,:y])

symbolic_gradient behaves as expected (just as in the one-dimensional case), but the @generated macro does not handle the multidimensional case the way I thought it would:

@generated functional_gradient(x,y) = symbolic_gradient
functional_gradient(1.0,1.0)

>> 2-element Array{Any,1}:
    :(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
    :(2 * 1 * y ^ (2 - 1))

That is, it does not convert the symbols into a generated function. Is there a simple way to solve this?

P.S.: I know I could define the derivative with respect to each argument as a one-dimensional function and bundle them together to form a gradient (this is what I am currently doing), but I believe there must be a better way.
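
For concreteness, the bundling I mean looks something like this (the component names dfdx and dfdy are just for illustration):

@generated dfdx(x,y) = symbolic_gradient[1]
@generated dfdy(x,y) = symbolic_gradient[2]
bundled_gradient(x,y) = (dfdx(x,y), dfdy(x,y))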

First of all, I don't think you need @generated here: this is a "simple" case of code generation, for which I think using @eval is both simpler and less surprising.

So the one-dimensional case can be rewritten like this:

julia> using Calculus

julia> symbolic_primal = :( x^2 + cos(x) )
:(x ^ 2 + cos(x))

julia> symbolic_derivative = differentiate(symbolic_primal,:x)
:(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))

julia> hand_defined_derivative(x) = 2x - sin(x)
hand_defined_derivative (generic function with 1 method)

# Let's check first what code we'll be evaluating
# (`quote` returns the unevaluated expression passed to it)
julia> quote
           functional_derivative(x) = $symbolic_derivative
       end
quote
    functional_derivative(x) = begin
            2 * 1 * x ^ (2 - 1) + 1 * -(sin(x))
        end
end

# Looks OK => let's evaluate it now
# (since `@eval` is macro, its argument will be left unevaluated
#  => no `quote` here)
julia> @eval begin
           functional_derivative(x) = $symbolic_derivative
       end
functional_derivative (generic function with 1 method)
julia> rand_x = rand(10000);
julia> exact_values = hand_defined_derivative.(rand_x);
julia> test_values = functional_derivative.(rand_x);

julia> @assert isequal(exact_values,test_values)

# Don't forget to interpolate array arguments when using `BenchmarkTools`
julia> using BenchmarkTools
julia> @btime hand_defined_derivative.($rand_x);
  104.259 μs (2 allocations: 78.20 KiB)

julia> @btime functional_derivative.($rand_x);
  104.537 μs (2 allocations: 78.20 KiB)

Now, the 2D case doesn't work because the output of differentiate is an array of expressions (one per component), which you need to transform into a single expression that builds an array (or better, a tuple, for performance) of those components. That is symbolic_gradient_expr in the example below:

julia> symbolic_primal = :( x^2 + cos(x) + y^2  )
:(x ^ 2 + cos(x) + y ^ 2)

julia> hand_defined_gradient(x, y) = (2x - sin(x), 2y)
hand_defined_gradient (generic function with 1 method)

# This is a vector of expressions
julia> symbolic_gradient = differentiate(symbolic_primal,[:x,:y])
2-element Array{Any,1}:
 :(2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)))
 :(2 * 1 * y ^ (2 - 1))

# Wrap expressions for all components of the gradient into a single expression
# generating a tuple of them:
julia> symbolic_gradient_expr = Expr(:tuple, symbolic_gradient...)
:((2 * 1 * x ^ (2 - 1) + 1 * -(sin(x)), 2 * 1 * y ^ (2 - 1)))

julia> @eval functional_gradient(x, y) = $symbolic_gradient_expr
functional_gradient (generic function with 1 method)
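
Incidentally, the same tuple-wrapping idea should also fix the original @generated approach, since the generator body just has to return a single expression rather than an array of them. A sketch along those lines (functional_gradient_gen is a made-up name):

# The generator body runs once per argument-type combination and must
# return one expression; Expr(:tuple, ...) combines the components into one.
@generated functional_gradient_gen(x, y) = Expr(:tuple, symbolic_gradient...)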

As in the 1D case, the @eval-defined gradient performs just like the hand-written version:

julia> rand_x = rand(10000); rand_y = rand(10000);
julia> exact_values = hand_defined_gradient.(rand_x, rand_y);
julia> test_values = functional_gradient.(rand_x, rand_y);

julia> @assert isequal(exact_values,test_values)

julia> @btime hand_defined_gradient.($rand_x, $rand_y);
  113.182 μs (2 allocations: 156.33 KiB)

julia> @btime functional_gradient.($rand_x, $rand_y);
  112.283 μs (2 allocations: 156.33 KiB)
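
Since the question also mentions Hessians, the same pattern should extend there: differentiate each gradient component again with respect to each variable, then wrap the resulting matrix of expressions into a single tuple-of-tuples expression. A sketch under that assumption (functional_hessian and vars are made-up names, and the expressions are left unsimplified):

vars = (:x, :y)
# Differentiate each gradient component with respect to each variable:
symbolic_hessian = [differentiate(g, v) for g in symbolic_gradient, v in vars]
# Wrap the 2×2 matrix of expressions into one expression building
# a tuple of row tuples:
symbolic_hessian_expr = Expr(:tuple,
    (Expr(:tuple, symbolic_hessian[i, :]...) for i in 1:length(vars))...)
@eval functional_hessian(x, y) = $symbolic_hessian_expr

functional_hessian(x, y) then returns the Hessian as a tuple of row tuples, again at the cost of an ordinary hand-written function.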