PyTorch

Question

I am interested in computing a mixed (combined) derivative with PyTorch:

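I tried this in the code below, but it computes two separate partial derivatives (first df/dx, then df/dy). Is there a way to modify the code so that the derivative is taken with respect to both parameters?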

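import torch

def function(x, y):
    # Elementwise, so the output has the same shape as x and y
    f = x**3 + y**3
    return f

a = torch.tensor([4., 5., 6.], requires_grad=True)
b = torch.tensor([1., 2., 6.], requires_grad=True)
# Returns a tuple of two Jacobians: one w.r.t. a and one w.r.t. b
derivative = torch.autograd.functional.jacobian(function, (a, b))
print(derivative)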

Thanks in advance!

Answer 1

You can use torch.autograd.functional.hessian to get the mixed derivatives.

>>> import torch.autograd.functional as A
>>> f = lambda x, y: (x**3 + y**3).mean()
>>> H = A.hessian(f, (a, b))

Because you have two inputs, the result is a tuple containing two tuples, each holding two tensors.
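One way to see that nesting is to print the shape of every block. This is only a minimal sketch: it redefines a, b and f with the values from the question so that it runs on its own, and it imports hessian directly instead of going through an alias.

```python
import torch
from torch.autograd.functional import hessian

# Same values as in the question; redefined here so the snippet is self-contained.
a = torch.tensor([4., 5., 6.])
b = torch.tensor([1., 2., 6.])
f = lambda x, y: (x**3 + y**3).mean()

H = hessian(f, (a, b))

# H[i][j] holds the second derivatives w.r.t. input i and input j.
for i, row in enumerate(H):
    for j, block in enumerate(row):
        print(f"H[{i}][{j}] shape: {tuple(block.shape)}")
```

Each block has shape (len(input i), len(input j)), so with two inputs of length 3 all four blocks are 3 by 3.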

More precisely, you get:

H[0][0]: second derivative w.r.t. x: d²f/dx_i dx_j
H[0][1]: mixed derivative w.r.t. x and y: d²f/dx_i dy_j
H[1][0]: mixed derivative w.r.t. y and x: d²f/dy_i dx_j
H[1][1]: second derivative w.r.t. y: d²f/dy_i dy_j
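To make the meaning of the mixed block concrete, it can also be built by hand with two nested calls to torch.autograd.grad. The sketch below is an illustration, not part of the original answer: the function g and the tensors x and y are hypothetical, chosen so that the mixed derivative is not zero.

```python
import torch

# Hypothetical function with a non-zero mixed derivative (not from the question).
def g(x, y):
    return (x**2 * y**3).sum()

x = torch.tensor([1., 2., 3.], requires_grad=True)
y = torch.tensor([2., 1., 0.5], requires_grad=True)

# First derivative w.r.t. x. create_graph=True keeps the graph so that
# the result can itself be differentiated.
(dg_dx,) = torch.autograd.grad(g(x, y), x, create_graph=True)

# Differentiate each component of dg/dx w.r.t. y; row i is d(dg/dx_i)/dy.
rows = [
    torch.autograd.grad(dg_dx[i], y, retain_graph=True)[0]
    for i in range(x.numel())
]
mixed = torch.stack(rows)

# The same block taken from the built-in Hessian.
H = torch.autograd.functional.hessian(g, (x, y))
print(torch.allclose(mixed, H[0][1]))
```

The loop costs one backward pass per element of x, so for anything beyond small inputs the built-in hessian is likely the more convenient choice.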

>>> H
((tensor([[ 8.,  0.,  0.],
          [ 0., 10.,  0.],
          [ 0.,  0., 12.]]),
  tensor([[ 0.,  0.,  0.],
          [ 0.,  0.,  0.],
          [ 0.,  0.,  0.]])),
 (tensor([[ 0.,  0.,  0.],
          [ 0.,  0.,  0.],
          [ 0.,  0.,  0.]]),
  tensor([[ 2.,  0.,  0.],
          [ 0.,  4.,  0.],
          [ 0.,  0., 12.]])))

Indeed, if you look at the mixed derivative: d²(x³+y³)/dxdy = d(3x²)/dy = 0, so H[0][1] and H[1][0] are zero matrices.

On the other hand, d²x³/dx² = 6x, and because f averages over the three values, this gives 6x/3 = 2x. Similarly, d²y³/dy² = 6y, which gives 6y/3 = 2y.

As a result, you find H[0][0] = diag(2a) and H[1][1] = diag(2b).
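These closed forms can be checked numerically. The following is a small sanity check rather than anything the answer requires; it reuses the a, b and f from above, redefined so the snippet runs on its own.

```python
import torch
from torch.autograd.functional import hessian

# Values and function from the question and answer, redefined for a standalone check.
a = torch.tensor([4., 5., 6.])
b = torch.tensor([1., 2., 6.])
f = lambda x, y: (x**3 + y**3).mean()

H = hessian(f, (a, b))

# Diagonal blocks: 6x/3 = 2x and 6y/3 = 2y on the diagonal.
assert torch.allclose(H[0][0], torch.diag(2 * a))
assert torch.allclose(H[1][1], torch.diag(2 * b))

# Mixed blocks: x and y never interact in f, so both are zero.
assert torch.allclose(H[0][1], torch.zeros(3, 3))
assert torch.allclose(H[1][0], torch.zeros(3, 3))
print("all checks passed")
```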

PyTorch - How to differentiate with respect to two parameters

python

gradient

derivative

tensor